IMPROVED EXPRESSION OF HIV POLYPEPTIDES AND 
PRODUCTION OF VIRUS-LIKE PARTICLES 

Cross-Reference to Related Applications 

This application is related to provisional patent 
applications serial nos. 60/114,495, filed December 31, 1998, 
and 60/168,471, filed December 1, 1999, from which priority 
is claimed under 35 USC §119(e)(1) and which applications 
are incorporated herein by reference in their entireties. 

Technical Field 

Synthetic expression cassettes encoding HIV 
polypeptides (e.g., Gag-, pol-, prot-, reverse 
transcriptase, Env- or tat-containing polypeptides) are 
described, as are uses of the expression cassettes. The 
present invention relates to the efficient expression of HIV 
polypeptides in a variety of cell types. Further, the 
invention provides methods of producing Virus-Like Particles 
(VLPs), as well as uses of the VLPs and high-level 
expression of oligomeric envelope proteins. 

Background of the Invention 

Acquired immune deficiency syndrome (AIDS) is 
recognized as one of the greatest health threats facing 
modern medicine. There is, as yet, no cure for this 
disease. 

In 1983-1984, three groups independently identified the 
suspected etiological agent of AIDS. See, e.g., 
Barre-Sinoussi et al. (1983) Science 220:868-871; 
Montagnier et al., in Human T-Cell Leukemia Viruses (Gallo, 
Essex & Gross, eds., 1984); Vilmer et al. (1984) The Lancet 
1:753; Popovic et al. (1984) Science 224:497-500; Levy et 
al. (1984) Science 225:840-842. These isolates were 
variously called lymphadenopathy-associated virus (LAV), 
human T-cell lymphotropic virus type III (HTLV-III), or 
AIDS-associated retrovirus (ARV). All of these isolates are 
strains of the same virus, and were later collectively 
named Human Immunodeficiency Virus (HIV). With the 
isolation of a related AIDS-causing virus, the strains 
originally called HIV are now termed HIV-1 and the related 
virus is called HIV-2. See, e.g., Guyader et al. (1987) 
Nature 326:662-669; Brun-Vezinet et al. (1986) Science 
233:343-346; Clavel et al. (1986) Nature 324:691-695. 

A great deal of information has been gathered about 
HIV; however, to date an effective vaccine has not 
been identified. Several targets for vaccine development 
have been examined, including the env, Gag, pol and tat gene 
products encoded by HIV. 

Haas, et al. (Current Biology 6(3):315-324, 1996) 
suggested that selective codon usage by HIV-1 appeared to 
account for a substantial fraction of the inefficiency of 
viral protein synthesis. Andre, et al. (J. Virol. 
72(2):1497-1503, 1998) described an increased immune 
response elicited by DNA vaccination employing a synthetic 
gp120 sequence with optimized codon usage. Schneider, et 
al. (J. Virol. 71(7):4892-4903, 1997) discuss inactivation 
of inhibitory (or instability) elements (INS) located within 
the Gag and Gag-protease coding sequences. 
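
The codon-usage idea discussed in the references above can be illustrated with a minimal sketch. It assumes a small, purely illustrative table of preferred codons (`PREFERRED`, not taken from this document or from the cited references) and a hypothetical helper `recode`; it replaces each codon with a preferred synonymous codon so that the encoded amino acid sequence is unchanged. It does not attempt to model INS inactivation.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Illustrative placeholder only: one preferred codon for a few amino acids.
# A real design would use a full usage table for the intended host cell.
PREFERRED = {"K": "AAG", "E": "GAG", "L": "CTG", "I": "ATC", "G": "GGC"}


def translate(cds):
    """Translate a coding sequence (length divisible by 3) to amino acids."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        # Fall back to the original codon when no preference is given.
        out.append(preferred.get(amino_acid, codon))
    return "".join(out)


original = "ATGAAAGAATTAATAGGA"      # encodes M-K-E-L-I-G
modified = recode(original, PREFERRED)
```

Because only synonymous codons are substituted, `translate(original)` and `translate(modified)` give the same protein, while the nucleotide sequence (and its A/T content) changes.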

The Gag proteins of HIV-1 are necessary for the 
assembly of virus-like particles. HIV-1 Gag proteins are 
involved in many stages of the life cycle of the virus, 
including assembly, virion maturation after particle 
release, and early post-entry steps in virus replication. 
The roles of HIV-1 Gag proteins are numerous and complex 
(Freed, E.O., Virology 251:1-15, 1998). 

Wolf, et al. (PCT International Application, WO 
96/30523, published 3 October 1996; European Patent 
Application, Publication No. 0 449 116 A1, published 2 
October 1991) have described the use of altered pr55 Gag of 
HIV-1 to act as a non-infectious retroviral-like particulate 
carrier, in particular, for the presentation of 
immunologically important epitopes. Wang, et al. (Virology 
200:524-534, 1994) describe a system to study assembly of 
HIV Gag-β-galactosidase fusion proteins into virions. They 
describe the construction of sequences encoding HIV 
Gag-β-galactosidase fusion proteins, the expression of such 
sequences in the presence of HIV Gag proteins, and assembly 
of these proteins into virus particles. 

Recently, Shiver, et al. (PCT International 
Application, WO 98/34640, published 13 August 1998) 
described altering HIV-1 (CAM1) Gag coding sequences to 
produce synthetic DNA molecules encoding HIV Gag and 
modifications of HIV Gag. The codons of the synthetic 
molecules were codons preferred by a projected host cell. 

The envelope protein of HIV-1 is a glycoprotein of 
about 160 kD (gp160). During virus infection of the host 
cell, gp160 is cleaved by host cell proteases to form gp120 
and the integral membrane protein, gp41. The gp41 portion 
is anchored in (and spans) the membrane bilayer of the virion, 
while the gp120 segment protrudes into the surrounding 
environment. As there is no covalent attachment between 
gp120 and gp41, free gp120 is released from the surface of 
virions and infected cells. 

Summary of the Invention 

The present invention relates to improved expression of 
HIV Env-, tat-, pol-, prot-, reverse transcriptase, or 
Gag-containing polypeptides and production of virus-like 
particles. 

In one embodiment the present invention includes an 
expression cassette, comprising a polynucleotide encoding an 
HIV Gag polypeptide comprising a sequence having at least 
90% sequence identity to the sequence presented as SEQ ID 
NO:20. In certain embodiments, the polynucleotide sequence 
encoding said Gag polypeptide comprises a sequence having at 
least 90% sequence identity to the sequence presented as SEQ 
ID NO:9 or SEQ ID NO:4. The expression cassettes may 
further include a polynucleotide sequence encoding an HIV 
protease polypeptide, for example a nucleotide sequence 
having at least 90% sequence identity to a sequence selected 
from the group consisting of: SEQ ID NO:5, SEQ ID NO:78, 
and SEQ ID NO:79. The expression cassettes may further 
include a polynucleotide sequence encoding an HIV reverse 
transcriptase polypeptide, for example a sequence having at 
least 90% sequence identity to a sequence selected from the 
group consisting of: SEQ ID NO:80, SEQ ID NO:81, SEQ ID 
NO:82, SEQ ID NO:83, and SEQ ID NO:84. The expression 
cassettes may further include a polynucleotide sequence 
encoding an HIV tat polypeptide, for example a sequence 
selected from the group consisting of: SEQ ID NO:87, SEQ ID 
NO:88, and SEQ ID NO:89. The expression cassettes may 
further include a polynucleotide sequence encoding an HIV 
polymerase polypeptide, for example a sequence having at 
least 90% sequence identity to the sequence presented as SEQ 
ID NO:6. The expression cassettes may include a 
polynucleotide sequence encoding an HIV polymerase 
polypeptide, wherein (i) the nucleotide sequence encoding 
said polypeptide comprises a sequence having at least 90% 
sequence identity to the sequence presented as SEQ ID NO:4, 
and (ii) wherein the sequence is modified by deletions of 
coding regions corresponding to reverse transcriptase and 
integrase. The expression cassettes described above may 
preserve T-helper cell and CTL epitopes. The expression 
cassettes may further include a polynucleotide sequence 
encoding an HCV core polypeptide, for example a sequence 
having at least 90% sequence identity to the sequence 
presented as SEQ ID NO:7. 
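
The "at least 90% sequence identity" threshold recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps; the alignment method and the exact identity definition are not stated in this passage, so the function name `percent_identity` and its gap handling are assumptions made only for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumption: inputs are pre-aligned, equal-length strings using '-' for gaps.
    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# One mismatch in ten aligned positions.
identity = percent_identity("ATGGCCAAGT", "ATGGCTAAGT")
meets_threshold = identity >= 90.0
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values for gapped alignments, so any comparison against a threshold should state which convention is used.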

In another aspect, the invention includes an expression 
cassette, comprising a polynucleotide sequence encoding a 
polypeptide including an HIV Env polypeptide, wherein the 
polynucleotide sequence encoding said Env polypeptide 
comprises a sequence having at least 90% sequence identity 
to SEQ ID NO:71 (Figure 58) or SEQ ID NO:72 (Figure 59). In 
certain embodiments, the Env expression cassettes include 
sequences flanking a V1 region but have a deletion in the V1 
region itself, for example the sequence presented as SEQ ID 
NO:65 (Figure 52, gp160.modUS4.delV1). In certain 
embodiments, the Env expression cassettes include sequences 
flanking a V2 region but have a deletion in the V2 region 
itself, for example the sequences shown in SEQ ID NO:60 
(Figure 47); SEQ ID NO:66 (Figure 53); SEQ ID NO:34 (Figure 
20); SEQ ID NO:37 (Figure 24); SEQ ID NO:40 (Figure 27); SEQ 
ID NO:43 (Figure 30); SEQ ID NO:46 (Figure 33); SEQ ID NO:76 
(Figure 64) and SEQ ID NO:49 (Figure 36). In certain 
embodiments, the Env expression cassettes include sequences 
flanking a V1/V2 region but have a deletion in the V1/V2 
region itself, for example, SEQ ID NO:59 (Figure 46); SEQ ID 
NO:61 (Figure 48); SEQ ID NO:67 (Figure 54); SEQ ID NO:75 
(Figure 63); SEQ ID NO:35 (Figure 21); SEQ ID NO:38 (Figure 
25); SEQ ID NO:41 (Figure 28); SEQ ID NO:44 (Figure 31); SEQ 
ID NO:47 (Figure 34) and SEQ ID NO:50 (Figure 37). The 
Env-encoding expression cassettes may also include a mutated 
cleavage site that prevents the cleavage of a gp140 
polypeptide into a gp120 polypeptide and a gp41 polypeptide, 
for example, SEQ ID NO:57 (Figure 44); SEQ ID NO:61 (Figure 
48); SEQ ID NO:63 (Figure 50); SEQ ID NO:39 (Figure 26); SEQ 
ID NO:40 (Figure 27); SEQ ID NO:41 (Figure 28); SEQ ID NO:42 
(Figure 29); SEQ ID NO:43 (Figure 30); SEQ ID NO:44 (Figure 
31); SEQ ID NO:45 (Figure 32); SEQ ID NO:46 (Figure 33); and 
SEQ ID NO:47 (Figure 34). The Env expression cassettes may 
include a gp160 Env polypeptide or a polypeptide derived 
from a gp160 Env polypeptide, for example SEQ ID NO:64 
(Figure 51); SEQ ID NO:65 (Figure 52); SEQ ID NO:66 (Figure 
53); SEQ ID NO:67 (Figure 54); SEQ ID NO:68 (Figure 55); SEQ 
ID NO:75 (Figure 63); SEQ ID NO:73 (Figure 61); SEQ ID NO:48 
(Figure 35); SEQ ID NO:49 (Figure 36); SEQ ID NO:50 (Figure 
37); SEQ ID NO:76 (Figure 64); and SEQ ID NO:74 (Figure 62). 
The Env expression cassettes may include a gp140 Env 
polypeptide or a polypeptide derived from a gp140 Env 
polypeptide, for example SEQ ID NO:56 (Figure 43); SEQ ID 
NO:57 (Figure 44); SEQ ID NO:58 (Figure 45); SEQ ID NO:59 
(Figure 46); SEQ ID NO:60 (Figure 47); SEQ ID NO:61 (Figure 
48); SEQ ID NO:62 (Figure 49); SEQ ID NO:63 (Figure 50); SEQ 
ID NO:36 (Figure 23); SEQ ID NO:37 (Figure 24); SEQ ID NO:38 
(Figure 25); SEQ ID NO:39 (Figure 26); SEQ ID NO:40 (Figure 
27); SEQ ID NO:41 (Figure 28); SEQ ID NO:42 (Figure 29); SEQ 
ID NO:43 (Figure 30); SEQ ID NO:44 (Figure 31); SEQ ID NO:45 
(Figure 32); SEQ ID NO:46 (Figure 33); and SEQ ID NO:47 
(Figure 34). The Env expression cassettes may also include 
a gp120 Env polypeptide or a polypeptide derived from a 
gp120 Env polypeptide, for example SEQ ID NO:54 (Figure 41); 
SEQ ID NO:55 (Figure 42); SEQ ID NO:33 (Figure 19); SEQ 
ID NO:34 (Figure 20); and SEQ ID NO:35 (Figure 21). The Env 
expression cassettes may include an Env polypeptide lacking 
the amino acids corresponding to residues 128 to about 194, 
relative to strains SF162 or US4, for example, SEQ ID NO:55 
(Figure 42); SEQ ID NO:62 (Figure 49); SEQ ID NO:63 (Figure 
50); and SEQ ID NO:68 (Figure 55). 

In another aspect, the invention includes a recombinant 
expression system for use in a selected host cell, 
comprising one or more of the expression cassettes 
described herein operably linked to control elements 
compatible with expression in the selected host cell. The 
expression cassettes may be included on one or on multiple 
vectors and may use the same or different promoters. 
Exemplary control elements include a transcription promoter 
(e.g., CMV, CMV+intron A, SV40, RSV, HIV-LTR, MMLV-LTR, and 
metallothionein), a transcription enhancer element, a 
transcription termination signal, polyadenylation sequences, 
sequences for optimization of initiation of translation, and 
translation termination sequences. 

In another aspect, the invention includes a recombinant 
expression system for use in a selected host cell, 
comprising any one of the expression cassettes described 
herein operably linked to control elements compatible with 
expression in the selected host cell. Exemplary control 
elements include, but are not limited to, a transcription 
promoter (e.g., CMV, CMV+intron A, SV40, RSV, HIV-LTR, 
MMLV-LTR, and metallothionein), a transcription enhancer element, 
a transcription termination signal, polyadenylation 
sequences, sequences for optimization of initiation of 
translation, and translation termination sequences. 

In yet another aspect, the invention includes a cell 
comprising one or more of the expression cassettes described 
herein operably linked to control elements compatible with 
expression in the cell. The cell can be, for example, a 
mammalian cell (e.g., BHK, VERO, HT1080, 293, RD, COS-7, or 
CHO cells), an insect cell (e.g., Trichoplusia ni (Tn5) or 
Sf9), a bacterial cell, a plant cell, a yeast cell, or an 
antigen presenting cell (e.g., primary, immortalized or 
tumor-derived lymphoid cells such as macrophages, monocytes, 
dendritic cells, B-cells, T-cells, stem cells, and 
progenitor cells thereof). 

In another aspect, the invention includes methods for 
producing a polypeptide including HIV Gag-, prot-, pol-, 
reverse transcriptase, Env- or Tat-containing polypeptide 
sequences, said method comprising incubating the cells 
comprising one or more of the expression cassettes described 
herein, under conditions for producing said polypeptide. 

In yet another aspect, the invention includes 
compositions for generating an immunological response, 
comprising one or more of the expression cassettes described 
herein. In certain embodiments, the compositions also 
include an adjuvant. 

In a still further aspect, the invention includes 
methods of generating an immune response in a subject, 
comprising introducing a composition comprising one or more 
of the expression cassettes described herein into the 
subject under conditions that are compatible with expression 
of said expression cassette in the subject. In certain 
embodiments, the expression cassette is introduced using a 
gene delivery vector. More than one expression cassette may 
be introduced using one or more gene delivery vectors. 

In yet another aspect, the invention includes a 
purified polynucleotide comprising a polynucleotide sequence 
encoding a polypeptide including an HIV Env polypeptide, 
wherein the polynucleotide sequence encoding said Env 
polypeptide comprises a sequence having at least 90% 
sequence identity to SEQ ID NO:71 (Figure 58) or SEQ ID 
NO:72 (Figure 59). Further exemplary purified 
polynucleotide sequences were presented above. 

The polynucleotides of the present invention can be 
produced by recombinant techniques, synthetic techniques, or 
combinations thereof. 

In another embodiment, the invention includes a method 
for producing a polypeptide including HIV Gag polypeptide 
sequences, where the method comprises incubating any of the 
above cells containing an expression cassette of interest 
under conditions for producing the polypeptide. 

The invention further includes a method for producing 
virus-like particles (VLPs) where the method comprises 
incubating any of the above-described cells containing an 
expression cassette of interest under conditions for 
producing VLPs. 

In another aspect the invention includes a method for 
producing a composition of virus-like particles (VLPs) 
where any of the above-described cells containing an 
expression cassette of interest are incubated under 
conditions for producing VLPs, and the VLPs are 
substantially purified to produce a composition of VLPs. 

In a further embodiment of the present invention, 
packaging cell lines are produced using the expression 
cassettes of the present invention. For example, a cell 
line useful for packaging lentivirus vectors comprises 
suitable host cells that have an expression vector 
containing an expression cassette of the present invention 
wherein said polynucleotide sequence is operably linked to 
control elements compatible with expression in the host 
cell. In a preferred embodiment, such host cells may be 
transfected with one or more expression cassettes having a 
polynucleotide sequence that encodes an HIV polymerase 
polypeptide or polypeptides derived therefrom, for example, 
where the nucleotide sequence encoding said polypeptide 
comprises a sequence having at least 90% sequence identity 
to the sequence presented as SEQ ID NO:6. Further, the HIV 
polymerase polypeptide may be modified by deletions of 
coding regions corresponding to reverse transcriptase and 
integrase. Such a polynucleotide sequence may preserve 
T-helper cell and CTL epitopes, for example when used in a 
vaccine application. In addition, the polynucleotide 
sequence may also encode other polypeptides. Further, 
polynucleotide sequences encoding additional polypeptides 
whose expression is useful for packaging cell line function 
may also be utilized. 

In another aspect, the present invention includes a 
gene delivery or vaccine vector for use in a subject, where 
the vector is a suitable gene delivery vector for use in the 
subject, and the vector comprises one or more of any of the 
expression cassettes of the present invention where the 
polynucleotide sequences of interest are operably linked to 
control elements compatible with expression in the subject. 
Such gene delivery vectors can be used in a method of DNA 
immunization of a subject, for example, by introducing a 
gene delivery vector into the subject under conditions that 
are compatible with expression of the expression cassette in 
the subject. Gene delivery vectors useful in the practice 
of the present invention include, but are not limited to, 
nonviral vectors, bacterial plasmid vectors, viral vectors, 
particulate carriers (where the vector is coated on a 
polylactide co-glycolide particle or a gold or tungsten 
particle; for example, the coated particle can be delivered 
to a subject cell using a gene gun), liposome preparations, 
and viral vectors (e.g., vectors derived from alphaviruses, 
pox viruses, and vaccinia viruses, as well as retroviral 
vectors, including, but not limited to, lentiviral vectors). 
Alphavirus-derived vectors include, for example, an 
alphavirus cDNA construct, a recombinant alphavirus particle 
preparation and a eukaryotic layered vector initiation 
system. In one embodiment, the subject is a vertebrate, 
preferably a mammal, and in a further embodiment the subject 
is a human. 

The invention further includes a method of generating 
an immune response in a subject, where cells of a subject 
are transfected with any of the above-described gene 
delivery vectors (e.g., alphavirus constructs; alphavirus 
cDNA constructs; eukaryotic layered vector initiation 
systems (see, e.g., U.S. Patent Number 5,814,482 for 
description of suitable eukaryotic layered vector initiation 
systems); alphavirus particle preparations; etc.) under 
conditions that permit the expression of a selected 
polynucleotide and production of a polypeptide of interest 
(i.e., encoded by any expression cassette of the present 
invention), thereby eliciting an immunological response to 
the polypeptide. Transfection of the cells may be performed 
ex vivo and the transfected cells are reintroduced into the 
subject. Alternately, or in addition, the cells may be 
transfected in vivo in the subject. The immune response may 
be humoral and/or cell-mediated (cellular). 

Further embodiments of the present invention include 
purified polynucleotides. In one embodiment, the purified 
polynucleotide comprises a polynucleotide sequence having at 
least 90% sequence identity to the sequence presented as SEQ 
ID NO:20, and complements thereof. In another embodiment, 
the purified polynucleotide comprises a polynucleotide 
sequence encoding an HIV Gag polypeptide, wherein the 
polynucleotide sequence comprises a sequence having at least 
90% sequence identity to the sequence presented as SEQ ID 
NO:20, and complements thereof. In still another 
embodiment, the purified polynucleotide comprises a 
polynucleotide sequence encoding an HIV Gag polypeptide, 
wherein the polynucleotide sequence comprises a sequence 
having at least 90% sequence identity to the sequence 
presented as SEQ ID NO:9, and complements thereof. In 
further embodiments the polynucleotide sequence comprises a 
sequence having at least 90% sequence identity to one of the 
following sequences: SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, 
SEQ ID NO:7, and complements thereof. 

These and other embodiments of the present invention 
will readily occur to those of ordinary skill in the art in 
view of the disclosure herein. 

Brief Description of the Figures 

Figure 1 shows the locations of the inactivation sites 
for the native HIV-1SF2 Gag protein coding sequence. 

Figure 2 shows the locations of the inactivation sites 
for the native HIV-1SF2 Gag-protease protein coding 
sequence. 

Figures 3A and 3B show electron micrographs of 
virus-like particles. Figure 3A shows immature p55Gag virus-like 
particles in COS-7 cells transfected with a synthetic 
HIV-1SF2 gag construct while Figure 3B shows mature (arrows) and 
immature VLP in cells transfected with a modified HIV-1SF2 
gagprotease construct (GP2, SEQ ID NO:70). Transfected 
cells were fixed at 24 h (gag) or 48 h (gagprotease) 
post-transfection and subsequently analyzed by electron 
microscopy (magnification at 100,000X). Cells transfected 
with vector alone (pCMVKm2) served as negative control (data 
not shown). 

Figure 4 presents an image of samples from a series of 
fractions which were electrophoresed on an 8-16% SDS 
polyacrylamide gel and the resulting bands visualized by 
Coomassie blue staining. The results show that the native 
p55 Gag virus-like particles (VLPs) banded at a sucrose 
density range of 1.15 - 1.19 g/ml with the peak at 
approximately 1.17 g/ml. 

Figure 5 presents an image similar to Figure 4 where 
the analysis was performed using Gag VLPs produced by a 
synthetic Gag expression cassette. 

Figure 6 presents a comparison of the total amount of 
purified HIV p55 Gag from several preparations obtained from 
two baculovirus expression cassettes encoding native and 
modified Gag. 

Figure 7 presents an alignment of modified coding 
sequences of the present invention including a synthetic Gag 
expression cassette (SEQ ID NO:4), a synthetic Gag-protease 
expression cassette (SEQ ID NO:5), and a synthetic 
Gag-polymerase expression cassette (SEQ ID NO:6). A common 
region (Gag-common; SEQ ID NO:9) extends from position 1 to 
position 1262. 

Figure 8 presents an image of wild-type Gag-HCV core 
expression samples from a series of fractions which were 
electrophoresed on an 8-16% SDS polyacrylamide gel and the 
resulting bands visualized by Coomassie staining. 

Figure 9 shows the results of Western blot analysis of 
the gel presented in Figure 8. 

Figure 10 presents results similar to those shown in 
Figure 9. The results in Figure 10 indicate that the main 
HCV Core-specific reactivity migrates at an approximate 
molecular weight of 72 kD, which is in accordance with 
the predicted molecular weight of the Gag-HCV core chimeric 
protein. 

Figures 11A to 11D present a comparison of AT content, 
in percent, of cDNAs corresponding to an unstable human mRNA 
(human IFNγ mRNA; 11A), wild-type HIV Gag native RNA (11B), 
a stable human mRNA (human GAPDH mRNA; 11C), and synthetic 
HIV Gag RNA (11D). 
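
A percent-AT comparison of the kind shown in Figures 11A to 11D can be approximated with a sliding-window calculation. The following is a minimal sketch only: the window size and step used to generate the figures are not stated here, so `window` and `step` are hypothetical parameters, and `at_content_profile` is an illustrative name.

```python
def at_content_profile(seq, window=100, step=1):
    """Return percent A+T for each window along a nucleotide sequence.

    RNA input is accepted by treating U as T. Returns an empty list when
    the sequence is shorter than one window.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper().replace("U", "T")
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        at_count = chunk.count("A") + chunk.count("T")
        profile.append(100.0 * at_count / window)
    return profile


# Toy input: AT-rich at the 5' end, GC-rich at the 3' end.
toy_profile = at_content_profile("AATTGGCC", window=4, step=2)
```

Plotting such a profile for a native and a synthetic coding sequence side by side gives a picture comparable in kind to the panels described above.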

Figure 12 shows the location of the inactivation sites 
for the native HIV-1SF2 Gag-polymerase sequence. 

Figure 13A presents a vector map of pESN2dhfr. 

Figure 13B presents a map of the pCMVIII vector. 

Figure 14 presents a vector map of pCMV-LINK. 

Figure 15 presents a schematic diagram showing the 
relationships between the following forms of the HIV Env 
polypeptide: gp160, gp140, gp120, and gp41. 

Figure 16 depicts the nucleotide sequence of wild-type 
gp120 from SF162 (SEQ ID NO:30). 

Figure 17 depicts the nucleotide sequence of the 
wild-type gp140 from SF162 (SEQ ID NO:31). 

Figure 18 depicts the nucleotide sequence of the 
wild-type gp160 from SF162 (SEQ ID NO:32). 

Figure 19 depicts the nucleotide sequence of the 
construct designated gp120.modSF162 (SEQ ID NO:33). 

Figure 20 depicts the nucleotide sequence of the 
construct designated gp120.modSF162.delV2 (SEQ ID NO:34). 

Figure 21 depicts the nucleotide sequence of the 
construct designated gp120.modSF162.delV1/V2 (SEQ ID NO:35). 

Figures 22A-H show the percent A-T content over the 
length of the sequences for IFNγ (Figures 22C and 22G); native 
gp160 Env US4 and SF162 (Figures 22A and 22E, respectively); 
GAPDH (Figures 22D and 22H); and the synthetic gp160 Env for 
US4 and SF162 (Figures 22B and 22F, respectively). 

Figure 23 depicts the nucleotide sequence of the 
construct designated gp140.modSF162 (SEQ ID NO:36). 

Figure 24 depicts the nucleotide sequence of the 
construct designated gp140.modSF162.delV2 (SEQ ID NO:37). 

Figure 25 depicts the nucleotide sequence of the 
construct designated gp140.modSF162.delV1/V2 (SEQ ID NO:38). 

Figure 26 depicts the nucleotide sequence of the 
construct designated gp140.mut.modSF162 (SEQ ID NO:39). 

Figure 27 depicts the nucleotide sequence of the 
construct designated gp140.mut.modSF162.delV2 (SEQ ID 
NO:40). 

Figure 28 depicts the nucleotide sequence of the 
construct designated gp140.mut.modSF162.delV1/V2 (SEQ ID 
NO:41). 

Figure 29 depicts the nucleotide sequence of the 
construct designated gp140.mut7.modSF162 (SEQ ID NO:42). 

Figure 30 depicts the nucleotide sequence of the 
construct designated gp140.mut7.modSF162.delV2 (SEQ ID 
NO:43). 

Figure 31 depicts the nucleotide sequence of the 
construct designated gp140.mut7.modSF162.delV1/V2 (SEQ ID 
NO:44). 

Figure 32 depicts the nucleotide sequence of the 
construct designated gp140.mut8.modSF162 (SEQ ID NO:45). 

Figure 33 depicts the nucleotide sequence of the 
construct designated gp140.mut8.modSF162.delV2 (SEQ ID 
NO:46). 

Figure 34 depicts the nucleotide sequence of the 
construct designated gp140.mut8.modSF162.delV1/V2 (SEQ ID 
NO:47). 

Figure 35 depicts the nucleotide sequence of the 
construct designated gp160.modSF162 (SEQ ID NO:48). 

Figure 36 depicts the nucleotide sequence of the 
construct designated gp160.modSF162.delV2 (SEQ ID NO:49). 

Figure 37 depicts the nucleotide sequence of the 
construct designated gp160.modSF162.delV1/V2 (SEQ ID NO:50). 

Figure 38 depicts the nucleotide sequence of the 
wild-type gp120 from US4 (SEQ ID NO:51). 

Figure 39 depicts the nucleotide sequence of the 
wild-type gp140 from US4 (SEQ ID NO:52). 

Figure 40 depicts the nucleotide sequence of the 
wild-type gp160 from US4 (SEQ ID NO:53). 

Figure 41 depicts the nucleotide sequence of the 
construct designated gp120.modUS4 (SEQ ID NO:54). 

Figure 42 depicts the nucleotide sequence of the 
construct designated gp120.modUS4.del 128-194 (SEQ ID 
NO:55). 

Figure 43 depicts the nucleotide sequence of the 
construct designated gp140.modUS4 (SEQ ID NO:56). 

Figure 44 depicts the nucleotide sequence of the 
construct designated gp140.mut.modUS4 (SEQ ID NO:57). 

Figure 45 depicts the nucleotide sequence of the 
construct designated gp140.TM.modUS4 (SEQ ID NO:58). 

Figure 46 depicts the nucleotide sequence of the 
construct designated gp140.modUS4.delV1/V2 (SEQ ID NO:59). 

Figure 47 depicts the nucleotide sequence of the 
construct designated gp140.modUS4.delV2 (SEQ ID NO:60). 

Figure 48 depicts the nucleotide sequence of the 
construct designated gp140.mut.modUS4.delV1/V2 (SEQ ID 
NO:61). 

Figure 49 depicts the nucleotide sequence of the 
construct designated gp140.modUS4.del 128-194 (SEQ ID 
NO:62). 

Figure 50 depicts the nucleotide sequence of the 
construct designated gp140.mut.modUS4.del 128-194 (SEQ ID 
NO:63). 

Figure 51 depicts the nucleotide sequence of the 
construct designated gp160.modUS4 (SEQ ID NO:64). 

Figure 52 depicts the nucleotide sequence of the 
construct designated gp160.modUS4.delV1 (SEQ ID NO:65). 

Figure 53 depicts the nucleotide sequence of the 
construct designated gp160.modUS4.delV2 (SEQ ID NO:66). 

Figure 54 depicts the nucleotide sequence of the 
construct designated gp160.modUS4.delV1/V2 (SEQ ID NO:67). 

Figure 55 depicts the nucleotide sequence of the 
construct designated gp160.modUS4.del 128-194 (SEQ ID 
NO:68). 

Figure 56 depicts the nucleotide sequence of the common 
region of Env from wild-type US4 (SEQ ID NO:69). 

Figure 57 depicts the nucleotide sequence of the common 
region of Env from wild-type SF162 (SEQ ID NO:70). 

Figure 58 depicts the nucleotide sequence of synthetic 
sequences corresponding to the common region of Env from US4 
(SEQ ID NO:71). 

Figure 59 depicts the nucleotide sequence of synthetic 
sequences corresponding to the common region of Env from 
SF162 (SEQ ID NO:72). 

Figure 60 presents a schematic representation of an Env 
polypeptide purification strategy. 

Figure 61 depicts the nucleotide sequence of the 
bicistronic construct designated gp160.modUS4.Gag.modSF2 
(SEQ ID NO:73). 

Figure 62 depicts the nucleotide sequence of the 
bicistronic construct designated gp160.modSF162.Gag.modSF2 
(SEQ ID NO:74). 

Figure 63 depicts the nucleotide sequence of the 
bicistronic construct designated 
gp160.modUS4.delV1/V2.Gag.modSF2 (SEQ ID NO:75). 

Figure 64 depicts the nucleotide sequence of the 
bicistronic construct designated 
gp160.modSF162.delV2.Gag.modSF2 (SEQ ID NO:76). 

Figures 65A-65F show micrographs of 293T cells 
transfected with the following polypeptide encoding 
sequences: Figure 65A, gag.modSF2; Figure 65B, gp160.modUS4; 
Figure 65C, gp160.modUS4.delV1/V2.gag.modSF2 (bicistronic 
Env and Gag); Figures 65D and 65E, gp160.modUS4.delV1/V2 and 
gag.modSF2; and Figure 65F, gp120.modSF162.delV2 and 
gag.modSF2. 

Figures 66A and 66B present alignments of selected 
modified coding sequences of the present invention including 
a common region defined for each group of synthetic Env 
expression cassettes. Figure 66A presents alignments of 
modified SF162 sequences. Figure 66B presents alignments of 
modified US4 sequences. The SEQ ID NOs for these sequences 
are presented in Tables 1A and 1B. 

Figure 67 shows the ELISA titers (binding antibodies) 
obtained in two rhesus macaques (H445, lines with solid 
black dots; and J408, lines with open squares). The y-axis 
shows the end-point gp140 ELISA titers and the x-axis shows 
weeks post-immunization. The dashed lines at 0, 4, and 8 
weeks represent DNA immunizations. The alternating 
dash/dotted line at 27 weeks indicates a DNA plus protein 
boost immunization. 

Figure 68 (SEQ ID NO: 77) depicts the wild-type
nucleotide sequence of Gag reverse transcriptase from SF2.

Figure 69 (SEQ ID NO: 78) depicts the nucleotide
sequence of the construct designated GP1.

Figure 70 (SEQ ID NO: 79) depicts the nucleotide
sequence of the construct designated GP2.

Figure 71 (SEQ ID NO: 80) depicts the nucleotide
sequence of the construct designated
FS(+).protinact.RTopt.YM. FS(+) indicates that there is a
frameshift in the GagPol coding sequence.

Figure 72 (SEQ ID NO: 81) depicts the nucleotide
sequence of the construct designated
FS(+).protinact.RTopt.YMWM.

Figure 73 (SEQ ID NO: 82) depicts the nucleotide
sequence of the construct designated FS(-).protmod.RTopt.YM.
FS(-) indicates that there is no frameshift in the GagPol
coding sequence.

Figure 74 (SEQ ID NO: 83) depicts the nucleotide
sequence of the construct designated
FS(-).protmod.RTopt.YMWM.

Figure 75 (SEQ ID NO: 84) depicts the nucleotide
sequence of the construct designated FS(-).protmod.RTopt(+).

Figure 76 (SEQ ID NO: 85) depicts the nucleotide 
sequence of wild type Tat from isolate SF162. 

Figure 77 (SEQ ID NO: 86) depicts the amino acid 
sequence of the tat polypeptide. 

Figure 78 (SEQ ID NO: 87) depicts the nucleotide 
sequence of a synthetic Tat construct designated 
Tat.SF162.opt. 

Figure 79 (SEQ ID NO: 88) depicts the nucleotide
sequence of a synthetic Tat construct designated
tat.cys22.sf162.opt. The construct encodes a tat
polypeptide in which the cysteine residue at position 22 of
the wild type Tat polypeptide is replaced by a glycine
residue.

Figures 80A to 80E are an alignment of the nucleotide
sequences of the constructs designated Gag.mod.SF2, GP1 (SEQ
ID NO:78), and GP2 (SEQ ID NO:79).

Figure 81 (SEQ ID NO: 89) depicts the nucleotide
sequence of the construct designated tataminoSF162.opt,
which encodes the amino terminus of the tat protein. The
codon encoding the cysteine-22 residue is underlined.

Figure 82 (SEQ ID NO: 90) depicts the amino acid
sequence of the polypeptide encoded by the construct
designated tat.cys22.SF162.opt (SEQ ID NO:88).



Detailed Description of the Invention

The practice of the present invention will employ, 
unless otherwise indicated, conventional methods of 
chemistry, biochemistry, molecular biology, immunology and 
pharmacology, within the skill of the art. Such techniques 
are explained fully in the literature. See, e.g.,
Remington's Pharmaceutical Sciences, 18th Edition (Easton,
Pennsylvania: Mack Publishing Company, 1990); Methods In
Enzymology (S. Colowick and N. Kaplan, eds., Academic Press,
Inc.); and Handbook of Experimental Immunology, Vols. I-IV
(D.M. Weir and C.C. Blackwell, eds., 1986, Blackwell
Scientific Publications); Sambrook, et al., Molecular
Cloning: A Laboratory Manual (2nd Edition, 1989); Short
Protocols in Molecular Biology, 4th ed. (Ausubel et al.
eds., 1999, John Wiley & Sons); Molecular Biology
Techniques: An Intensive Laboratory Course, (Ream et al.,
eds., 1998, Academic Press); PCR (Introduction to
Biotechniques Series), 2nd ed. (Newton & Graham eds., 1997,
Springer Verlag).

All publications, patents and patent applications cited 
herein, whether supra or infra, are hereby incorporated by 
reference in their entirety. 

As used in this specification and the appended claims, 
the singular forms "a," "an" and "the" include plural 
references unless the content clearly dictates otherwise. 
Thus, for example, reference to "an antigen" includes a
mixture of two or more such agents.

1. Definitions

In describing the present invention, the following 
terms will be employed, and are intended to be defined as 
indicated below.

"Synthetic" sequences, as used herein, refers to Env-,
tat- or Gag-encoding polynucleotides whose expression has
been optimized as described herein, for example, by codon
substitution, deletions, replacements and/or inactivation of
inhibitory sequences. "Wild-type" or "native" sequences, as
used herein, refers to polypeptide encoding sequences that
are essentially as they are found in nature, e.g., Gag
encoding sequences as found in the isolate HIV-1SF2 or Env
encoding sequences as found in the isolates HIV-1SF162 or
HIV-1US4.

As used herein, the term "virus-like particle" or "VLP"
refers to a nonreplicating viral shell, derived from any
of several viruses discussed further below. VLPs are
generally composed of one or more viral proteins, such as, 
but not limited to those proteins referred to as capsid, 
coat, shell, surface and/or envelope proteins, or particle- 
forming polypeptides derived from these proteins. VLPs can 
form spontaneously upon recombinant expression of the 
protein in an appropriate expression system. Methods for 
producing particular VLPs are known in the art and discussed 
more fully below. The presence of VLPs following 
recombinant expression of viral proteins can be detected 
using conventional techniques known in the art, such as by 
electron microscopy, biophysical characterization, and the 
like. See, e.g., Baker et al., Biophys. J. (1991)
60:1445-1456; Hagensee et al., J. Virol. (1994) 68:4503-4505.
For example, VLPs can be isolated by density gradient
centrifugation and/or identified by characteristic density 
banding (e.g., Example 7). Alternatively, cryoelectron 
microscopy can be performed on vitrified aqueous samples of 
the VLP preparation in question, and images recorded under 
appropriate exposure conditions. 

By "particle-forming polypeptide" derived from a
particular viral protein is meant a full-length or near 
full-length viral protein, as well as a fragment thereof, or 
a viral protein with internal deletions, which has the 
ability to form VLPs under conditions that favor VLP 
formation. Accordingly, the polypeptide may comprise the 
full-length sequence, fragments, truncated and partial 
sequences, as well as analogs and precursor forms of the 
reference molecule. The term therefore intends deletions, 
additions and substitutions to the sequence, so long as the 
polypeptide retains the ability to form a VLP. Thus, the 
term includes natural variations of the specified 
polypeptide since variations in coat proteins often occur 
between viral isolates. The term also includes deletions, 
additions and substitutions that do not naturally occur in 
the reference protein, so long as the protein retains the 
ability to form a VLP. Preferred substitutions are those 
which are conservative in nature, i.e., those substitutions 
that take place within a family of amino acids that are 
related in their side chains. Specifically, amino acids are
generally divided into four families: (1) acidic --
aspartate and glutamate; (2) basic -- lysine, arginine,
histidine; (3) non-polar -- alanine, valine, leucine,
isoleucine, proline, phenylalanine, methionine, tryptophan;
and (4) uncharged polar -- glycine, asparagine, glutamine,
cysteine, serine, threonine, tyrosine. Phenylalanine,
tryptophan, and tyrosine are sometimes classified as
aromatic amino acids.
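
By way of illustration only, the four families listed above can be
encoded as a lookup so that a proposed substitution can be checked
for whether it stays within one family. The following Python sketch
is not part of the disclosed methods; the names FAMILIES, family_of
and is_conservative are hypothetical, and the one-letter residue
codes are an assumed encoding of the families named in the text.

```python
# Illustrative sketch only: the four side-chain families named in the text,
# written with standard one-letter amino acid codes (an assumed encoding).
FAMILIES = {
    "acidic": set("DE"),               # aspartate, glutamate
    "basic": set("KRH"),               # lysine, arginine, histidine
    "non-polar": set("AVLIPFMW"),      # alanine ... tryptophan
    "uncharged polar": set("GNQCSTY"), # glycine ... tyrosine
}


def family_of(residue):
    """Return the family name for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in FAMILIES.items():
        if code in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, replacement):
    """True when both residues belong to the same side-chain family."""
    return family_of(original) == family_of(replacement)
```

Under these assumptions, is_conservative("D", "E") evaluates to True
(both acidic), while is_conservative("K", "E") evaluates to False
(basic versus acidic).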

An "antigen" refers to a molecule containing one or 
more epitopes (either linear, conformational or both) that 
will stimulate a host's immune system to make a humoral 
and/or cellular antigen-specific response. The term is used 
interchangeably with the term "immunogen." Normally, a B- 
cell epitope will include at least about 5 amino acids but 
can be as small as 3-4 amino acids. A T-cell epitope, such 
as a CTL epitope, will include at least about 7-9 amino 
acids, and a helper T-cell epitope at least about 12-20 
amino acids. Normally, an epitope will include between 
about 7 and 15 amino acids, such as, 9, 10, 12 or 15 amino 
acids. The term "antigen" denotes both subunit antigens
(i.e., antigens which are separate and discrete from a whole
organism with which the antigen is associated in nature), as
well as killed, attenuated or inactivated bacteria,
viruses, fungi, parasites or other microbes. Antibodies
such as anti-idiotype antibodies, or fragments thereof, and 
synthetic peptide mimotopes, which can mimic an antigen or 
antigenic determinant, are also captured under the 
definition of antigen as used herein. Similarly, an 
oligonucleotide or polynucleotide which expresses an antigen 
or antigenic determinant in vivo, such as in gene therapy 
and DNA immunization applications, is also included in the 
definition of antigen herein. 

For purposes of the present invention, antigens can be 
derived from any of several known viruses, bacteria, 
parasites and fungi, as described more fully below. The 
term also intends any of the various tumor antigens. 
Furthermore, for purposes of the present invention, an
"antigen" refers to a protein which includes modifications,
such as deletions, additions and substitutions (generally
conservative in nature), to the native sequence, so long as
the protein maintains the ability to elicit an immunological
response, as defined herein. These modifications may be
deliberate, as through site-directed mutagenesis, or may be
accidental, such as through mutations of hosts which produce
the antigens.

An "immunological response" to an antigen or
composition is the development in a subject of a humoral
and/or a cellular immune response to an antigen present in
the composition of interest. For purposes of the present
invention, a "humoral immune response" refers to an immune
response mediated by antibody molecules, while a "cellular
immune response" is one mediated by T-lymphocytes and/or
other white blood cells. One important aspect of cellular
immunity involves an antigen-specific response by cytolytic
T-cells ("CTL"s). CTLs have specificity for peptide
antigens that are presented in association with proteins
encoded by the major histocompatibility complex (MHC) and
expressed on the surfaces of cells. CTLs help induce and
promote the destruction of intracellular microbes, or the
lysis of cells infected with such microbes. Another aspect
of cellular immunity involves an antigen-specific response
by helper T-cells. Helper T-cells act to help stimulate the
function, and focus the activity of, nonspecific effector
cells against cells displaying peptide antigens in
association with MHC molecules on their surface. A
"cellular immune response" also refers to the production of
cytokines, chemokines and other such molecules produced by
activated T-cells and/or other white blood cells, including
those derived from CD4+ and CD8+ T-cells.

A composition or vaccine that elicits a cellular immune
response may serve to sensitize a vertebrate subject by the
presentation of antigen in association with MHC molecules at
the cell surface. The cell-mediated immune response is
directed at, or near, cells presenting antigen at their
surface. In addition, antigen-specific T-lymphocytes can be
generated to allow for the future protection of an immunized
host.

The ability of a particular antigen to stimulate a
cell-mediated immunological response may be determined by a
number of assays, such as by lymphoproliferation (lymphocyte
activation) assays, CTL cytotoxic cell assays, or by
assaying for T-lymphocytes specific for the antigen in a
sensitized subject. Such assays are well known in the art.
See, e.g., Erickson et al., J. Immunol. (1993) 151:4189-4199;
Doe et al., Eur. J. Immunol. (1994) 24:2369-2376.
Recent methods of measuring cell-mediated immune response
include measurement of intracellular cytokines or cytokine
secretion by T-cell populations, or by measurement of
epitope specific T-cells (e.g., by the tetramer
technique) (reviewed by McMichael, A.J., and O'Callaghan,
C.A., J. Exp. Med. 187(9):1367-1371, 1998; McHeyzer-Williams,
M.G., et al., Immunol. Rev. 150:5-21, 1996; Lalvani, A., et
al., J. Exp. Med. 186:859-865, 1997).

Thus, an immunological response as used herein may be
one which stimulates the production of CTLs, and/or the
production or activation of helper T-cells. The antigen of
interest may also elicit an antibody-mediated immune
response. Hence, an immunological response may include one
or more of the following effects: the production of
antibodies by B-cells; and/or the activation of suppressor
T-cells and/or γδ T-cells directed specifically to an
antigen or antigens present in the composition or vaccine of
interest. These responses may serve to neutralize
infectivity, and/or mediate antibody-complement, or antibody
dependent cell cytotoxicity (ADCC) to provide protection to
an immunized host. Such responses can be determined using
standard immunoassays and neutralization assays, well known
in the art.

An "immunogenic composition" is a composition that 
comprises an antigenic molecule where administration of the 
composition to a subject results in the development in the 
subject of a humoral and/or a cellular immune response to
the antigenic molecule of interest. 

By "subunit vaccine" is meant a vaccine composition 
which includes one or more selected antigens but not all 
antigens, derived from or homologous to, an antigen from a 
pathogen of interest such as from a virus, bacterium, 
parasite or fungus. Such a composition is substantially 
free of intact pathogen cells or pathogenic particles, or 
the lysate of such cells or particles. Thus, a "subunit 
vaccine" can be prepared from at least partially purified 
(preferably substantially purified) immunogenic polypeptides 
from the pathogen, or analogs thereof. The method of 
obtaining an antigen included in the subunit vaccine can 
thus include standard purification techniques, recombinant 
production, or synthetic production.

"Substantially purified" generally refers to isolation of
a substance (compound, polynucleotide, protein, polypeptide,
polypeptide composition) such that the substance comprises
the majority percent of the sample in which it resides.
Typically in a sample a substantially purified component
comprises 50%, preferably 80%-85%, more preferably 90-95% of
the sample. Techniques for purifying polynucleotides and
polypeptides of interest are well-known in the art and
include, for example, ion-exchange chromatography, affinity
chromatography and sedimentation according to density.

A "coding sequence" or a sequence which "encodes" a 
selected polypeptide, is a nucleic acid molecule which is 
transcribed (in the case of DNA) and translated (in the case 
of mRNA) into a polypeptide in vivo when placed under the 
control of appropriate regulatory sequences (or "control 
elements"). The boundaries of the coding sequence are 
determined by a start codon at the 5' (amino) terminus and a
translation stop codon at the 3' (carboxy) terminus. A
coding sequence can include, but is not limited to, cDNA
from viral, procaryotic or eucaryotic mRNA, genomic DNA
sequences from viral or procaryotic DNA, and even synthetic
DNA sequences. A transcription termination sequence may be
located 3' to the coding sequence.
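
As a simplified illustration of how the boundaries of a coding
sequence follow from a start codon and a translation stop codon, the
Python sketch below scans a DNA sense strand for the first ATG that
is followed by an in-frame stop codon. It is a sketch under stated
assumptions, not a method of the invention: ATG as the start codon,
the stop codons TAA, TAG and TGA of the standard genetic code, and
the hypothetical name find_coding_sequence are all assumptions, and
control elements are not modeled.

```python
# Illustrative sketch only; assumes a DNA sense strand, ATG as start codon,
# and the stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna):
    """Return (start, end) of the first ATG...stop reading frame, or None.

    `start` indexes the A of the start codon; `end` is one past the last
    base of the stop codon, so dna[start:end] is the coding sequence.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon, in frame with this start codon.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop codon; try the next ATG.
        start = dna.find("ATG", start + 1)
    return None
```

For the input "CCATGGCTTAAGG" this sketch returns (2, 11), which
corresponds to the subsequence ATGGCTTAA.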

Typical "control elements" include, but are not
limited to, transcription promoters, transcription enhancer
elements, transcription termination signals, polyadenylation
sequences (located 3' to the translation stop codon),
sequences for optimization of initiation of translation
(located 5' to the coding sequence), and translation
termination sequences; see, e.g., McCaughan et al. (1995)
PNAS USA 92:5431-5435; Kochetov et al. (1998) FEBS Letts.
440:351-355.

A "nucleic acid" molecule can include, but is not 
limited to, procaryotic sequences, eucaryotic mRNA, cDNA 
from eucaryotic mRNA, genomic DNA sequences from eucaryotic 
(e.g., mammalian) DNA, and even synthetic DNA sequences.
The term also captures sequences that include any of the
known base analogs of DNA and RNA.

"Operably linked" refers to an arrangement of elements 
wherein the components so described are configured so as to 
perform their usual function. Thus, a given promoter 
operably linked to a coding sequence is capable of effecting 
the expression of the coding sequence when the proper 
enzymes are present. The promoter need not be contiguous 
with the coding sequence, so long as it functions to direct 
the expression thereof. Thus, for example, intervening 
untranslated yet transcribed sequences can be present 
between the promoter sequence and the coding sequence and 
the promoter sequence can still be considered "operably 
linked" to the coding sequence. 

"Recombinant" as used herein to describe a nucleic acid 
molecule means a polynucleotide of genomic, cDNA, 
semisynthetic, or synthetic origin which, by virtue of its 
origin or manipulation: (1) is not associated with all or a 
portion of the polynucleotide with which it is associated in 
nature; and/or (2) is linked to a polynucleotide other than
that to which it is linked in nature. The term
"recombinant" as used with respect to a protein or polypeptide
means a polypeptide produced by expression of a recombinant 
polynucleotide. "Recombinant host cells," "host cells," 
"cells," "cell lines," "cell cultures," and other such terms 
denoting procaryotic microorganisms or eucaryotic cell lines 
cultured as unicellular entities, are used interchangeably, 
and refer to cells which can be, or have been, used as 
recipients for recombinant vectors or other transfer DNA, 
and include the progeny of the original cell which has been 
transfected. It is understood that the progeny of a single
parental cell may not necessarily be completely identical in
morphology or in genomic or total DNA complement to the
original parent, due to accidental or deliberate mutation. 
Progeny of the parental cell which are sufficiently similar 
to the parent to be characterized by the relevant property, 
such as the presence of a nucleotide sequence encoding a 
desired peptide, are included in the progeny intended by 
this definition, and are covered by the above terms. 

Techniques for determining amino acid sequence 
"similarity" are well known in the art. In general, 
"similarity" means the exact amino acid to amino acid 
comparison of two or more polypeptides at the appropriate 
place, where amino acids are identical or possess similar 
chemical and/or physical properties such as charge or 
hydrophobicity. A so-termed "percent similarity" then can 
be determined between the compared polypeptide sequences. 
Techniques for determining nucleic acid and amino acid 
sequence identity also are well known in the art and include 
determining the nucleotide sequence of the mRNA for that 
gene (usually via a cDNA intermediate) and determining the 
amino acid sequence encoded thereby, and comparing this to a 
second amino acid sequence. In general, "identity" refers 
to an exact nucleotide to nucleotide or amino acid to amino 
acid correspondence of two polynucleotides or polypeptide 
sequences, respectively. 

Two or more polynucleotide sequences can be compared by 
determining their "percent identity." Two or more amino 
acid sequences likewise can be compared by determining their 
"percent identity." The percent identity of two sequences, 
whether nucleic acid or peptide sequences, is generally 
described as the number of exact matches between two aligned
sequences divided by the length of the shorter sequence and
multiplied by 100. An approximate alignment for nucleic
acid sequences is provided by the local homology algorithm
of Smith and Waterman, Advances in Applied Mathematics
2:482-489 (1981). This algorithm can be extended to use
with peptide sequences using the scoring matrix developed by
Dayhoff, Atlas of Protein Sequences and Structure, M.O.
Dayhoff ed., 5 suppl. 3:353-358, National Biomedical
Research Foundation, Washington, D.C., USA, and normalized
by Gribskov, Nucl. Acids Res. 14(6):6745-6763 (1986). An
implementation of this algorithm for nucleic acid and
peptide sequences is provided by the Genetics Computer Group
(Madison, WI) in their BestFit utility application. The
default parameters for this method are described in the
Wisconsin Sequence Analysis Package Program Manual, Version
8 (1995) (available from Genetics Computer Group, Madison,
WI). Other equally suitable programs for calculating the
percent identity or similarity between sequences are 
generally known in the art. 
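
The arithmetic of the percent identity definition given above (exact
matches between two aligned sequences, divided by the length of the
shorter sequence, multiplied by 100) can be sketched in Python as
follows. The sketch assumes the two sequences have already been
aligned to equal length and that "-" marks a gap position; the
function name percent_identity and the gap symbol are hypothetical
choices and do not reflect the interface of any program named
herein.

```python
# Illustrative sketch only; assumes pre-aligned, equal-length strings in
# which "-" marks an alignment gap.
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Exact matches / length of the shorter (ungapped) sequence * 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count positions where both sequences carry the same residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    # Sequence lengths are measured without gap characters.
    shorter = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter
```

For example, the aligned pair "ACGTACGT" and "ACGTTCGT" has seven
exact matches over a shorter-sequence length of eight, giving 87.5.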

For example, percent identity of a particular 
nucleotide sequence to a reference sequence can be 
determined using the homology algorithm of Smith and 
Waterman with a default scoring table and a gap penalty of 
six nucleotide positions. Another method of establishing 
percent identity in the context of the present invention is 
to use the MPSRCH package of programs copyrighted by the 
University of Edinburgh, developed by John F. Collins and 
Shane S. Sturrok, and distributed by IntelliGenetics, Inc. 
(Mountain View, CA). From this suite of packages, the
Smith-Waterman algorithm can be employed where default
parameters are used for the scoring table (for example, gap
open penalty of 12, gap extension penalty of one, and a gap
of six). From the data generated, the "Match" value
reflects "sequence identity." Other suitable programs for
calculating the percent identity or similarity between
sequences are generally known in the art, such as the
alignment program BLAST, which can also be used with default
parameters. For example, BLASTN and BLASTP can be used with
the following default parameters: genetic code = standard;
filter = none; strand = both; cutoff = 60; expect = 10;
Matrix = BLOSUM62; Descriptions = 50 sequences; sort by =
HIGH SCORE; Databases = non-redundant, GenBank + EMBL + DDBJ
+ PDB + GenBank CDS translations + Swiss protein + Spupdate
+ PIR. Details of these programs can be found at the
following internet address:
http://www.ncbi.nlm.gov/cgi-bin/BLAST.
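
For orientation, a Smith-Waterman local alignment score with a gap
open penalty of 12 and a gap extension penalty of one can be
sketched in Python as shown below. This is a minimal sketch and not
the MPSRCH, BestFit or BLAST implementation: the match and mismatch
scores (+5 and -4) are illustrative assumptions rather than any
package's default scoring table, the convention that a gap of k
positions costs gap_open + (k - 1) * gap_extend is assumed, and the
function name smith_waterman_score is hypothetical.

```python
# Illustrative sketch only: local alignment score with affine gap penalties.
# match/mismatch values are assumptions, not a named package's defaults.
def smith_waterman_score(a, b, match=5, mismatch=-4,
                         gap_open=12, gap_extend=1):
    """Return the best local alignment score between sequences a and b."""
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0] * cols        # best scores for alignments ending in row i-1
    f_prev = [neg_inf] * cols  # row i-1 scores ending in a gap along b
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf            # score ending in a gap along a (same row)
        for j in range(1, cols):
            # Open a new gap or extend an existing one, in each direction.
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores are floored at zero.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best
```

Under these assumed scores, two identical four-nucleotide sequences
score 20, and aligning "AAAAACCCCC" with "AAAAAGGGCCCCC" scores 36
(ten matches less a three-position gap costing 12 + 1 + 1).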

One of skill in the art can readily determine the 
proper search parameters to use for a given sequence in the 
above programs. For example, the search parameters may vary 
based on the size of the sequence in question. Thus, for 
example, a representative embodiment of the present 
invention would include an isolated polynucleotide having X 
contiguous nucleotides, wherein (i) the X contiguous 
nucleotides have at least about 50% identity to Y contiguous 
nucleotides derived from any of the sequences described 
herein, (ii) X equals Y, and (iii) X is greater than or 
equal to 6 nucleotides and up to 5000 nucleotides, 
preferably greater than or equal to 8 nucleotides and up to 
5000 nucleotides, more preferably 10-12 nucleotides and up 
to 5000 nucleotides, and even more preferably 15-20
nucleotides, up to the number of nucleotides present in the
full-length sequences described herein (e.g., see the
Sequence Listing and claims), including all integer values
falling within the above-described ranges.
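
The representative embodiment above (X contiguous nucleotides having
at least about 50% identity to Y contiguous nucleotides of a
reference sequence, with X equal to Y) can be illustrated with a
simple ungapped window comparison. The Python sketch below is a
simplification for illustration only: it compares the query against
every equal-length window of the reference without allowing gaps,
which is an assumption and not the alignment procedure described
herein, and the names best_window_identity and meets_embodiment are
hypothetical.

```python
# Illustrative sketch only: ungapped comparison of a query of X nucleotides
# against every window of Y = X contiguous nucleotides in a reference.
def best_window_identity(query, reference):
    """Highest percent identity of query to any equal-length window."""
    x = len(query)
    if x == 0 or x > len(reference):
        raise ValueError("query must be non-empty and fit in the reference")
    best = 0.0
    for start in range(len(reference) - x + 1):
        window = reference[start:start + x]
        matches = sum(q == r for q, r in zip(query, window))
        best = max(best, 100.0 * matches / x)
    return best


def meets_embodiment(query, reference, min_identity=50.0,
                     min_length=6, max_length=5000):
    """Check the length bounds and the identity threshold from the text."""
    if not (min_length <= len(query) <= max_length):
        return False
    if len(query) > len(reference):
        return False
    return best_window_identity(query, reference) >= min_identity
```

For example, the query "ACGTAC" matches a window of the reference
"TTACGTACTT" exactly, so the check passes, whereas a three-nucleotide
query fails the minimum length of six regardless of identity.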

The synthetic expression cassettes (and purified 
polynucleotides) of the present invention include related 
polynucleotide sequences having about 80% to 100%, greater 
than 80-85%, preferably greater than 90-92%, more preferably 
greater than 95%, and most preferably greater than 98% 
sequence (including all integer values falling within these 
described ranges) identity to the synthetic expression 
cassette sequences disclosed herein (for example, to the 
sequences presented in Tables 1A and 1B) when the sequences
of the present invention are used as the query sequence. 

Two nucleic acid fragments are considered to 
"selectively hybridize" as described herein. The degree of 
sequence identity between two nucleic acid molecules affects 
the efficiency and strength of hybridization events between 
such molecules. A partially identical nucleic acid sequence 
will at least partially inhibit a completely identical 
sequence from hybridizing to a target molecule. Inhibition 
of hybridization of the completely identical sequence can be 
assessed using hybridization assays that are well known in 
the art (e.g., Southern blot, Northern blot, solution 
hybridization, or the like, see Sambrook, et al., Molecular
Cloning: A Laboratory Manual, Second Edition, (1989) Cold
Spring Harbor, N.Y.). Such assays can be conducted using
varying degrees of selectivity, for example, using
conditions varying from low to high stringency. If
conditions of low stringency are employed, the absence of
non-specific binding can be assessed using a secondary probe
that lacks even a partial degree of sequence identity (for
example, a probe having less than about 30% sequence
identity with the target molecule), such that, in the
absence of non-specific binding events, the secondary probe 
will not hybridize to the target. 

When utilizing a hybridization-based detection system, 
a nucleic acid probe is chosen that is complementary to a 
target nucleic acid sequence, and then by selection of 
appropriate conditions the probe and the target sequence 
"selectively hybridize," or bind, to each other to form a 
hybrid molecule. A nucleic acid molecule that is capable of
hybridizing selectively to a target sequence under
"moderately stringent" conditions typically hybridizes under
conditions that allow detection of a target nucleic acid
sequence of at
least about 10-14 nucleotides in length having at least 
approximately 70% sequence identity with the sequence of the 
selected nucleic acid probe. Stringent hybridization 
conditions typically allow detection of target nucleic acid 
sequences of at least about 10-14 nucleotides in length 
having a sequence identity of greater than about 90-95% with 
the sequence of the selected nucleic acid probe. 
Hybridization conditions useful for probe/target 
hybridization where the probe and target have a specific 
degree of sequence identity, can be determined as is known 
in the art (see, for example, Nucleic Acid Hybridization: A
Practical Approach, editors B.D. Hames and S.J. Higgins,
(1985) Oxford; Washington, DC; IRL Press).

With respect to stringency conditions for 
hybridization, it is well known in the art that numerous
equivalent conditions can be employed to establish a
particular stringency by varying, for example, the following
factors: the length and nature of probe and target
sequences, base composition of the various sequences,
concentrations of salts and other hybridization solution
components, the presence or absence of blocking agents in
the hybridization solutions (e.g., formamide, dextran
sulfate, and polyethylene glycol), hybridization reaction
temperature and time parameters, as well as varying wash
conditions. A particular set of hybridization conditions
is selected following standard methods in the art (see,
for example, Sambrook, et al., Molecular Cloning: A
Laboratory Manual, Second Edition, (1989) Cold Spring
Harbor, N.Y.).

A first polynucleotide is "derived from" a second
polynucleotide if it has the same or substantially the same 
basepair sequence as a region of the second polynucleotide, 
its cDNA, complements thereof, or if it displays sequence 
identity as described above. 

A first polypeptide is "derived from" a second 
polypeptide if it is (i) encoded by a first polynucleotide 
derived from a second polynucleotide, or (ii) displays 
sequence identity to the second polypeptide as described
above.

Generally, a viral polypeptide is "derived from" a 
particular polypeptide of a virus (viral polypeptide) if it 
is (i) encoded by an open reading frame of a polynucleotide 
of that virus (viral polynucleotide), or (ii) displays
sequence identity to polypeptides of that virus as described
above.

"Encoded by" refers to a nucleic acid sequence which
codes for a polypeptide sequence, wherein the polypeptide 
sequence or a portion thereof contains an amino acid 
sequence of at least 3 to 5 amino acids, more preferably at 
least 8 to 10 amino acids, and even more preferably at least 
15 to 20 amino acids from a polypeptide encoded by the 
nucleic acid sequence. Also encompassed are polypeptide 
sequences which are immunologically identifiable with a 
polypeptide encoded by the sequence. 

"Purified polynucleotide" refers to a polynucleotide of 
interest or fragment thereof which is essentially free, 
e.g., contains less than about 50%, preferably less than 
about 70%, and more preferably less than about 90%, of the 
protein with which the polynucleotide is naturally 
associated. Techniques for purifying polynucleotides of 
interest are well-known in the art and include, for example, 
disruption of the cell containing the polynucleotide with a 
chaotropic agent and separation of the polynucleotide(s) and
proteins by ion-exchange chromatography, affinity 
chromatography and sedimentation according to density. 

By "nucleic acid immunization" is meant the 
introduction of a nucleic acid molecule encoding one or more 
selected antigens into a host cell, for the in vivo 
expression of an antigen, antigens, an epitope, or epitopes. 
The nucleic acid molecule can be introduced directly into a 
recipient subject, such as by injection, inhalation, oral, 
intranasal and mucosal administration, or the like, or can 
be introduced ex vivo, into cells which have been removed 
from the host. In the latter case, the transformed cells 
are reintroduced into the subject where an immune response
can be mounted against the antigen encoded by the nucleic
acid molecule.

"Gene transfer" or "gene delivery" refers to methods or 
systems for reliably inserting DNA or RNA of interest into a 
host cell. Such methods can result in transient expression 
of non-integrated transferred DNA, extrachromosomal
replication and expression of transferred replicons (e.g.,
episomes), or integration of transferred genetic material
into the genomic DNA of host cells. Gene delivery 
expression vectors include, but are not limited to, vectors 
derived from bacterial plasmid vectors, viral vectors, non- 
viral vectors, alphaviruses, pox viruses and vaccinia 
viruses. When used for immunization, such gene delivery 
expression vectors may be referred to as vaccines or vaccine 
vectors.

"T lymphocytes" or "T cells" are non-antibody producing
lymphocytes that constitute a part of the cell-mediated arm
of the immune system. T cells arise from immature 
lymphocytes that migrate from the bone marrow to the thymus, 
where they undergo a maturation process under the direction 
of thymic hormones. Here, the mature lymphocytes rapidly 
divide increasing to very large numbers. The maturing T 
cells become immunocompetent based on their ability to 
recognize and bind a specific antigen. Activation of 
immunocompetent T cells is triggered when an antigen binds 
to the lymphocyte ' s surface receptors . 

The term "transfection" is used to refer to the uptake
of foreign DNA by a cell. A cell has been "transfected"
when exogenous DNA has been introduced inside the cell
membrane. A number of transfection techniques are generally
known in the art. See, e.g., Graham et al. (1973) Virology,
52:456, Sambrook et al. (1989) Molecular Cloning, a
laboratory manual, Cold Spring Harbor Laboratories, New
York, Davis et al. (1986) Basic Methods in Molecular
Biology, Elsevier, and Chu et al. (1981) Gene 13:197. Such
techniques can be used to introduce one or more exogenous
DNA moieties into suitable host cells. The term refers to
both stable and transient uptake of the genetic material,
and includes uptake of peptide- or antibody-linked DNAs.

A "vector" is capable of transferring gene sequences to 
target cells (e.g., bacterial plasmid vectors, viral 
vectors, non-viral vectors, particulate carriers, and 
liposomes). Typically, "vector construct," "expression 
vector," and "gene transfer vector," mean any nucleic acid 
construct capable of directing the expression of a gene of 
interest and which can transfer gene sequences to target 
cells. Thus, the term includes cloning and expression 
vehicles, as well as viral vectors. 

Transfer of a "suicide gene" (e.g., a drug-susceptibility
gene) to a target cell renders the cell sensitive to
compounds or compositions that are relatively nontoxic to
normal cells. Moolten, F.L. (1994) Cancer Gene Ther.
1:279-287. Examples of suicide genes are thymidine kinase
of herpes simplex virus (HSV-tk), cytochrome P450 (Manome
et al. (1996) Gene Therapy 3:513-520), human deoxycytidine
kinase (Manome et al. (1996) Nature Medicine 2(5):567-573)
and the bacterial enzyme cytosine deaminase (Dong et al.
(1996) Human Gene Therapy 7:713-720). Cells which express
these genes are rendered sensitive to the effects of the
relatively nontoxic prodrugs ganciclovir (HSV-tk),
cyclophosphamide (cytochrome P450 2B1), cytosine
arabinoside (human deoxycytidine kinase) or
5-fluorocytosine (bacterial cytosine deaminase). Culver et
al. (1992) Science 256:1550-1552, Huber et al. (1994) Proc.
Natl. Acad. Sci. USA 91:8302-8306.

A "selectable marker" or "reporter marker" refers to a 
nucleotide sequence included in a gene transfer vector that 
has no therapeutic activity, but rather is included to allow 
for simpler preparation, manufacturing, characterization or 
testing of the gene transfer vector. 

A "specific binding agent" refers to a member of a 
specific binding pair of molecules wherein one of the 
molecules specifically binds to the second molecule through 
chemical and/or physical means. One example of a specific 
binding agent is an antibody directed against a selected 
antigen. 

By "subject" is meant any member of the subphylum 
chordata, including, without limitation, humans and other 
primates, including non-human primates such as chimpanzees 
and other apes and monkey species; farm animals such as 
cattle, sheep, pigs, goats and horses; domestic mammals such 
as dogs and cats; laboratory animals including rodents such 
as mice, rats and guinea pigs; birds, including domestic, 
wild and game birds such as chickens, turkeys and other 
gallinaceous birds, ducks, geese, and the like. The term 
does not denote a particular age. Thus, both adult and 
newborn individuals are intended to be covered. The system 
described above is intended for use in any of the above 
vertebrate species, since the immune systems of all of these 
vertebrates operate similarly. 

By "pharmaceutically acceptable" or 
"pharmacologically acceptable" is meant a material which is 
not biologically or otherwise undesirable, i.e., the
material may be administered to an individual in a
formulation or composition without causing any undesirable 
biological effects or interacting in a deleterious manner 
with any of the components of the composition in which it is 
contained. 

By "physiological pH" or a "pH in the physiological 
range" is meant a pH in the range of approximately 7.2 to 
8.0 inclusive, more typically in the range of approximately 
7.2 to 7.6 inclusive. 

As used herein, "treatment" refers to any of (i) the
prevention of infection or reinfection, as in a traditional
vaccine, (ii) the reduction or elimination of symptoms, and
(iii) the substantial or complete elimination of the
pathogen in question. Treatment may be effected
prophylactically (prior to infection) or therapeutically
(following infection).

"Lentiviral vector", and "recombinant lentiviral 
vector" are derived from the subset of retroviral vectors
known as lentiviruses. Lentiviral vectors refer to a
nucleic acid construct which carries, and within certain
embodiments, is capable of directing the expression of a
nucleic acid molecule of interest. The lentiviral vector
includes at least one transcriptional promoter/enhancer or
locus defining element(s), or other elements which control
gene expression by other means such as alternate splicing,
nuclear RNA export, post-transcriptional modification of
messenger, or post-translational modification of protein.
Such vector constructs must also include a packaging signal,
long terminal repeats (LTRs) or portion thereof, and
positive and negative strand primer binding sites
appropriate to the lentiviral vector used (if these are not
already present in the retroviral vector). Optionally, the
recombinant lentiviral vector may also include a signal
which directs polyadenylation, selectable markers such as
Neo, TK, hygromycin, phleomycin, histidinol, or DHFR, as
well as one or more restriction sites and a translation
termination sequence. By way of example, such vectors
typically include a 5' LTR, a tRNA binding site, a packaging
signal, an origin of second strand DNA synthesis, and a
3' LTR or a portion thereof.

"Lentiviral vector particle" as utilized within the 
present invention refers to a lentivirus which carries at
least one gene of interest. The retrovirus may also contain
a selectable marker. The recombinant lentivirus is capable
of reverse transcribing its genetic material (RNA) into DNA
and incorporating this genetic material into a host cell's
DNA upon infection. Lentiviral vector particles may have a
lentiviral envelope, a non-lentiviral envelope (e.g., an
ampho or VSV-G envelope), or a chimeric envelope.

"Nucleic acid expression vector" or "Expression
cassette" refers to an assembly which is capable of
directing the expression of a sequence or gene of interest.
The nucleic acid expression vector includes a promoter which
is operably linked to the sequences or gene(s) of interest.
Other control elements may be present as well. Expression
cassettes described herein may be contained within a plasmid
construct. In addition to the components of the expression
cassette, the plasmid construct may also include a bacterial
origin of replication, one or more selectable markers, a
signal which allows the plasmid construct to exist as
single-stranded DNA (e.g., a M13 origin of replication), a
multiple cloning site, and a "mammalian" origin of
replication (e.g., a SV40 or adenovirus origin of
replication).

"Packaging cell" refers to a cell which contains those 
elements necessary for production of infectious recombinant 
retrovirus (e.g., lentivirus) which are lacking in a 
recombinant retroviral vector. Typically, such packaging
cells contain one or more expression cassettes which are
capable of expressing Gag, pol and env proteins.

"Producer cell" or "vector producing cell" refers to a 
cell which contains all elements necessary for production of 
recombinant retroviral vector particles. 

2. Modes of Carrying Out the Invention

Before describing the present invention in detail, it 
is to be understood that this invention is not limited to 
particular formulations or process parameters as such may, 
of course, vary. It is also to be understood that the 
terminology used herein is for the purpose of describing 
particular embodiments of the invention only, and is not 
intended to be limiting. 

Although a number of methods and materials similar or 
equivalent to those described herein can be used in the 
practice of the present invention, the preferred materials 
and methods are described herein. 

2.1 Synthetic Expression Cassettes

2.1.1 Modification of HIV-1 Gag Nucleic Acid Coding Sequences

One aspect of the present invention is the generation
of HIV-1 Gag protein coding sequences, and related
sequences, having improved expression relative to the
corresponding wild-type sequence. An exemplary embodiment
of the present invention is illustrated herein modifying the
Gag protein wild-type sequences obtained from the HIV-1SF2
strain (SEQ ID NO:1; Sanchez-Pescador, R., et al., Science
227(4686):484-492, 1985; Luciw, P.A., et al., U.S. Patent
No. 5,156,949, issued October 20, 1992, herein incorporated
by reference; Luciw, P.A., et al., U.S. Patent No.
5,688,688, November 18, 1997, herein incorporated by
reference). Gag sequence obtained from other HIV variants
may be manipulated in similar fashion following the
teachings of the present specification. Such other variants
include, but are not limited to, Gag protein encoding
sequences obtained from the isolates HIV IIIb, HIV SF2,
HIV-1 SF162, HIV-1 SF170, HIV^, HIV^, HIV™, HIV-1 CM235,
HIV-1 US4, other HIV-1 strains from diverse subtypes (e.g.,
subtypes A through G, and O), HIV-2 strains and diverse
subtypes (e.g., HIV-2 UC1 and HIV-2 UC2), and simian
immunodeficiency virus (SIV). (See, e.g., Virology, 3rd
Edition (W.K. Joklik ed. 1988); Fundamental Virology, 2nd
Edition (B.N. Fields and D.M. Knipe, eds. 1991); Virology,
3rd Edition (Fields, BN, DM Knipe, PM Howley, Editors, 1996,
Lippincott-Raven, Philadelphia, PA), for a description of
these and other related viruses.)

First, the HIV-1 codon usage pattern was modified so
that the resulting nucleic acid coding sequence was
comparable to codon usage found in highly expressed human
genes (Example 1). The HIV codon usage reflects a high
content of the nucleotides A or T of the codon-triplet. The
effect of the HIV-1 codon usage is a high AT content in the
DNA sequence that results in a decreased translation
ability and instability of the mRNA. In comparison, highly
expressed human codons prefer the nucleotides G or C. The
Gag coding sequences were modified to be comparable to codon
usage found in highly expressed human genes. In Figure 11
(Example 1), the percent A-T content of cDNA sequences
corresponding to the mRNA for a known unstable mRNA and a
known stable mRNA are compared to the percent A-T content of
native HIV-1SF2 Gag cDNA and to the synthetic Gag cDNA
sequence of the present invention. Experiments performed in
support of the present invention showed that the synthetic
Gag sequences were capable of higher level of protein
production (see the Examples) relative to the native Gag
sequences. The data in Figure 11 suggest that one reason
for this increased production is increased stability of the
mRNA corresponding to the synthetic Gag coding sequences
versus the mRNA corresponding to the native Gag coding
sequences.
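
The codon-usage modification described above can be
illustrated with a minimal sketch. The Python below is an
illustration only and is not the procedure of Example 1: the
function names (at_content, recode), the PREFERRED table (one
G/C-rich codon per amino acid, chosen here as an assumption)
and the toy input sequence are all hypothetical. The sketch
recodes a coding sequence codon by codon without changing the
encoded protein, then compares the percent A-T content before
and after.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumed, illustrative choice of one preferred codon per amino acid
# ("*" marks a stop codon). Not taken from the specification.
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def codons(seq):
    """Split a coding sequence into codons; length must be a multiple of 3."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into a one-letter protein string."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def recode(seq):
    """Replace every codon with the assumed preferred synonymous codon."""
    return "".join(PREFERRED[CODON_TABLE[c]] for c in codons(seq))


def at_content(seq):
    """Percent of A and T nucleotides in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


toy_wild_type = "ATGAAATTAGCATTAGAAAAATAA"  # hypothetical A/T-rich input
toy_synthetic = recode(toy_wild_type)

print(f"wild type: {at_content(toy_wild_type):.1f}% A-T")
print(f"synthetic: {at_content(toy_synthetic):.1f}% A-T")
```

In this toy case the recoded sequence encodes the same
protein with a lower A-T content, which is the property
compared in Figure 11.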

Second, there are inhibitory (or instability) elements
(INS) located within the Gag coding sequences (Example 1).
The RRE is a secondary RNA structure that interacts with the
HIV encoded Rev-protein to overcome the expression
down-regulating effects of the INS. To overcome the
post-transcriptional activating mechanisms of RRE and Rev,
the instability elements were inactivated by introducing
multiple point mutations that did not alter the reading
frame of the encoded proteins. Figure 1 shows the original
SF2 Gag sequence, the location of the INS sequences, and the
modifications made to the INS sequences to reduce their
effects. The resulting modified coding sequences are
presented as a synthetic Gag expression cassette (SEQ ID
NO:4).

Modification of the Gag polypeptide coding sequences
resulted in improved expression relative to the wild-type
coding sequences in a number of mammalian cell lines (as
well as other types of cell lines, including, but not
limited to, insect cells). Further, expression of the
sequences resulted in production of virus-like particles
(VLPs) by these cell lines (see below). Similar Gag
polypeptide coding sequences can be obtained from a variety
of isolates (families, sub-types, strains, etc.) including,
but not limited to, Gag polypeptide encoding sequences
obtained from the isolates HIV IIIb, HIV SF2, HIV-1 SF162,
HIV-1 SF170, HIV^, HIV^, HIV™, HIV-1 CM235, HIV-1 US4, other
HIV-1 strains from diverse subtypes (e.g., subtypes A
through G, and O), HIV-2 strains and diverse subtypes (e.g.,
HIV-2 UC1 and HIV-2 UC2), and simian immunodeficiency virus
(SIV). (See, e.g., Virology, 3rd Edition (W.K. Joklik ed.
1988); Fundamental Virology, 2nd Edition (B.N. Fields and
D.M. Knipe, eds. 1991); Virology, 3rd Edition (Fields, BN,
DM Knipe, PM Howley, Editors, 1996, Lippincott-Raven,
Philadelphia, PA).) Gag polypeptide encoding sequences
derived from these variants can be optimized and tested for
improved expression in mammals by following the teachings of
the present specification (see the Examples, in particular
Example 1).
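
The constraint placed on the INS-inactivating point
mutations described above (the reading frame is not altered)
can be checked mechanically. The following Python sketch is
an illustration under stated assumptions, not the method of
Example 1: the helper names (apply_point_mutations,
preserves_coding), the toy sequence and the mutation
positions are hypothetical, and the check is stricter than
the text requires in that it also demands an unchanged amino
acid sequence.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate a coding sequence (length a multiple of 3) to protein."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def apply_point_mutations(seq, mutations):
    """Apply (zero-based position, new base) substitutions to a sequence."""
    bases = list(seq.upper())
    for position, new_base in mutations:
        bases[position] = new_base.upper()
    return "".join(bases)


def preserves_coding(original, mutated):
    """True if length (reading frame) and encoded protein are both unchanged."""
    return (len(original) == len(mutated)
            and translate(original) == translate(mutated))


toy_ins_region = "ATGAAATTAGAATAA"  # hypothetical A/T-rich stretch
silent = [(5, "G"), (6, "C"), (8, "G"), (11, "G")]  # assumed positions
mutated = apply_point_mutations(toy_ins_region, silent)

print(mutated, preserves_coding(toy_ins_region, mutated))
```

A substitution that changes an amino acid, such as replacing
the middle base of the second codon in the toy sequence,
makes preserves_coding return False and would be rejected.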

2.1.2 Further Modification of Sequences Including HIV-1 Gag Nucleic Acid Coding Sequences

Experiments performed in support of the present 
invention have shown that similar modifications of HIV-1 
Gag-protease, Gag-reverse transcriptase and Gag-polymerase 
sequences also result in improved expression of the
polyproteins, as well as, the production of VLPs formed by
polypeptides produced from such modified coding sequences. 

For the Gag-protease sequence (wild type, SEQ ID NO:2;
modified, SEQ ID NOs:5, 78, 79), the changes in codon usage
were restricted to the regions upstream of the -1 frameshift
(Figure 2). Further, inhibitory (or instability) elements
(INS) located within the coding sequences of the
Gag-protease polypeptide coding sequence were altered as
well (indicated in Figure 2). Exemplary constructs (which
include the -1 frameshift) encoding modified Gag-protease
sequences include those shown in SEQ ID NOs:78 and 79
(Figures 69 and 70). These are: GP1 (SEQ ID NO:78), in which
the protease region was also codon optimized and INS
inactivated, and GP2 (SEQ ID NO:79), in which the protease
region was only subjected to INS inactivation.

For other Gag-containing sequences, for example the
Gag-polymerase sequence (wild type, SEQ ID NO:3; modified,
SEQ ID NO:6) or Gag-reverse transcriptase (wild type, SEQ ID
NO:77; modified, SEQ ID NOs:80-84), the changes in codon
usage are similar to those for the Gag-protease
sequence. Those expression cassettes which contain a
frameshift in the GagPol coding sequence are designated
"FS(+)" (SEQ ID NOs:80 and 81, Figures 71 and 72) while the
designation "FS(-)" (SEQ ID NOs:82, 83 and 84, Figures 73,
74 and 75) indicates that there is no frameshift utilized in
this coding sequence.

In addition to polyproteins containing HIV-related
sequences, the various Gag-, Gag-prot, Gag-pol, Gag-reverse
transcriptase encoding sequences of the present invention
can be fused to other polypeptides (creating chimeric
polypeptides) for which an immunogenic response is desired.
An example of such a chimeric protein is the joining of the
improved expression Gag encoding sequences to the Hepatitis
C Virus (HCV) core protein. In this case, the HCV-core
encoding sequences were placed in-frame with the HIV-Gag
encoding sequences, resulting in the Gag/HCV-core encoding
sequence presented as SEQ ID NO:7 (wild type sequence
presented as SEQ ID NO:8).
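
Placing one coding sequence in-frame with another, as
described above for the Gag/HCV-core fusion, amounts to
joining the two at a codon boundary with no intervening stop
codon. The Python sketch below illustrates that arithmetic
only; the function names (split_codons, fuse_in_frame) and
the two short toy sequences are hypothetical and do not
correspond to SEQ ID NO:7 or SEQ ID NO:8.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq):
    """Split a coding sequence into codons; length must be a multiple of 3."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(upstream, downstream):
    """Join two coding sequences so the downstream one stays in frame.

    A terminal stop codon on the upstream sequence is removed; an
    internal stop codon in the upstream sequence is treated as an error.
    """
    up = split_codons(upstream)
    down = split_codons(downstream)
    if up and up[-1] in STOP_CODONS:
        up = up[:-1]
    if any(codon in STOP_CODONS for codon in up):
        raise ValueError("upstream sequence contains an internal stop codon")
    return "".join(up + down)


toy_upstream = "ATGGGCGCCTAA"    # hypothetical stand-in for a Gag sequence
toy_downstream = "ATGAGCACCTGA"  # hypothetical stand-in for an HCV-core sequence
fusion = fuse_in_frame(toy_upstream, toy_downstream)

print(fusion, len(fusion) % 3 == 0)
```

Because both inputs are validated as whole codons, the
downstream sequence necessarily begins at a codon boundary
of the fused sequence.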

Further sequences useful in the practice of the present
invention include, but are not limited to, sequences
encoding viral epitopes/antigens (including but not limited
to, HCV antigens (e.g., E1, E2; Houghton, M., et al., U.S.
Patent No. 5,714,596, issued February 3, 1998; Houghton,
M., et al., U.S. Patent No. 5,712,088, issued January 27,
1998; Houghton, M., et al., U.S. Patent No. 5,683,864,
issued November 4, 1997; Weiner, A.J., et al., U.S. Patent
No. 5,728,520, issued March 17, 1998; Weiner, A.J., et al.,
U.S. Patent No. 5,766,845, issued June 16, 1998; Weiner,
A.J., et al., U.S. Patent No. 5,670,152, issued September
23, 1997; all herein incorporated by reference) and HIV
antigens (e.g., derived from nef, tat, rev, vpu, vif, vpr
and/or env)); and sequences encoding tumor antigens/epitopes.
Additional sequences are described below. Variations
on the orientation of the Gag and other coding sequences,
relative to each other, are also described below.

Gag, Gag-protease, Gag-reverse transcriptase and/or
Gag-polymerase polypeptide coding sequences can be obtained
from any HIV isolates (different families, subtypes, and
strains) including but not limited to the isolates HIV IIIb,
HIV SF2, HIV SF162, HIV US4, HIV CM235, HIV^, HIV LAI, HIV.J
(see, e.g., Myers et al., Los Alamos Database, Los Alamos
National Laboratory, Los Alamos, New Mexico (1992); Myers et
al., Human Retroviruses and Aids, 1997, Los Alamos, New
Mexico: Los Alamos National Laboratory). Synthetic expression
cassettes can be generated using such coding sequences as
starting material by following the teachings of the present
specification (e.g., see Example 1). Further, the synthetic
expression cassettes of the present invention include
related Gag polypeptide coding sequences having greater than
75%, preferably greater than 80-85%, more preferably greater
than 90-95%, and most preferably greater than 98% sequence
identity (or any integer value within these ranges) to the
synthetic expression cassette sequences disclosed herein
(for example, SEQ ID NO:4; SEQ ID NO:5; SEQ ID NO:6; and SEQ
ID NO:20, the Gag Major Homology Region).
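
The sequence identity thresholds recited above can be
illustrated with a short calculation. The Python sketch
below is a simplification under stated assumptions: it
compares two sequences that are already aligned and of equal
length, with no gaps, whereas a real comparison would first
align the sequences with an alignment program. The function
names (percent_identity, thresholds_exceeded) and the toy
sequences are hypothetical.

```python
# Percent thresholds recited in the text (greater than 75%, 80-85%,
# 90-95% and 98%).
THRESHOLDS = (75, 80, 85, 90, 95, 98)


def percent_identity(seq_a, seq_b):
    """Percent of identical positions in two pre-aligned, ungapped sequences."""
    a, b = seq_a.upper(), seq_b.upper()
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def thresholds_exceeded(identity):
    """Return the recited thresholds that the identity value is greater than."""
    return [t for t in THRESHOLDS if identity > t]


toy_reference = "ATGGGCGCCCGCGCCAGCGT"  # hypothetical 20-nucleotide reference
toy_candidate = "ATGGGCGCCCGCGCCAGCGA"  # differs at the last position only

identity = percent_identity(toy_reference, toy_candidate)
print(identity, thresholds_exceeded(identity))
```

Note that "greater than" is applied strictly, so a candidate
at exactly 95% identity exceeds the 90% threshold but not the
95% threshold.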

2.1.3 Expression of Synthetic Sequences Encoding HIV-1 Gag and Related Polypeptides

Several synthetic Gag-encoding sequences (expression
cassettes) of the present invention were cloned into a
number of different expression vectors (Example 1) to
evaluate levels of expression and production of VLPs. Two
modified synthetic coding sequences are presented as a
synthetic Gag expression cassette (SEQ ID NO:4) and a
synthetic Gag-protease expression cassette (SEQ ID NOs:78
and 79). Other synthetic Gag-encoding sequences are
presented, for example, as SEQ ID NOs:80 through 84. The
synthetic DNA fragments for Gag-encoding polypeptides (e.g.,
Gag, Gag-protease, Gag-polymerase, Gag-reverse
transcriptase) were cloned into expression vectors described
in Example 1, including a transient expression vector,
CMV-promoter-based mammalian vectors, and a shuttle vector
for use in baculovirus expression systems. Corresponding
wild-type sequences were cloned into the same vectors.
These vectors were then transfected into several
different cell types, including a variety of mammalian cell
lines (293, RD, COS-7, and CHO; cell lines available, for
example, from the A.T.C.C.). The cell lines were cultured
under appropriate conditions and the levels of p24 (Gag)
expression in supernatants were evaluated (Example 2). The
results of these assays demonstrated that expression of
synthetic Gag-encoding sequences was significantly higher
than that of corresponding wild-type sequences (Example 2;
Table 2).

Further, Western Blot analysis showed that cells 
containing the synthetic Gag expression cassette produced 
the expected 55 kD (p55) protein at higher per-cell 
concentrations than cells containing the native expression 
cassette. The Gag p55 protein was seen in both cell lysates 
and supernatants. The levels of production were 
significantly higher in cell supernatants for cells 
transfected with the synthetic Gag expression cassette of 
the present invention. Experiments performed in support of 
the present invention suggest that cells containing the 
synthetic Gag-prot expression cassettes produced the 
expected Gag-prot protein at comparably higher per-cell 
concentrations than cells containing the wild-type 
expression cassette. 

Fractionation of the supernatants from mammalian cells
transfected with the synthetic Gag expression cassette
showed that it provides superior production of both p55
protein and VLPs, relative to the wild-type Gag sequences
(Examples 6 and 7).

Efficient expression of these Gag-containing
polypeptides in mammalian cell lines provides the following
benefits: the Gag polypeptides are free of baculovirus
contaminants; production is by established methods approved
by the FDA; increased purity; greater yields (relative to
native coding sequences); and a novel method of producing
the Gag-containing polypeptides in CHO or other mammalian
cells which is not feasible in the absence of the increased
expression obtained using the constructs of the present
invention. Exemplary mammalian cell lines include, but are
not limited to, BHK, VERO, HT1080, 293, 293T, RD, COS-7,
CHO, Jurkat, HUT, SUPT, C8166, MOLT4/clone8, MT-2, MT-4, H9,
PM1, CEM, myeloma cells (e.g., SB20 cells) and CEMX174 (such
cell lines are available, for example, from the A.T.C.C.).

A synthetic Gag expression cassette of the present
invention also demonstrated high levels of expression and
VLP production when transfected into insect cells (Example
7). Further, in addition to a higher total protein yield,
the final product from the synthetic p55-expressed Gag
consistently contained lower amounts of contaminating
baculovirus proteins than the final purified product from
the native p55-expressed Gag.

Further, synthetic Gag expression cassettes of the
present invention have also been introduced into yeast
vectors which were transformed into and efficiently
expressed by yeast cells (Saccharomyces cerevisiae; using
vectors as described in Rosenberg, S. and Tekamp-Olson, P.,
U.S. Patent No. RE35,749, issued March 17, 1998, herein
incorporated by reference).

In addition to the mammalian and insect vectors 
described in the Examples, the synthetic expression 
cassettes of the present invention can be incorporated into 
a variety of expression vectors using selected expression 
control elements. Appropriate vectors and control elements 
for any given cell type can be selected by one having
ordinary skill in the art in view of the teachings of the
present specification and information known in the art about 
expression vectors. 

For example, a synthetic Gag expression cassette can be
inserted into a vector which includes control elements
operably linked to the desired coding sequence, which allow
for the expression of the gene in a selected cell-type. For
example, typical promoters for mammalian cell expression
include the SV40 early promoter, a CMV promoter such as the
CMV immediate early promoter (a CMV promoter can include
intron A), RSV, HIV-LTR, the mouse mammary tumor virus LTR
promoter (MMLV-LTR), FIV-LTR, the adenovirus major late
promoter (Ad MLP), and the herpes simplex virus promoter,
among others. Other nonviral promoters, such as a promoter
derived from the murine metallothionein gene, will also find
use for mammalian expression. Typically, transcription
termination and polyadenylation sequences will also be
present, located 3' to the translation stop codon.
Preferably, a sequence for optimization of initiation of
translation, located 5' to the coding sequence, is also
present. Examples of transcription
terminator/polyadenylation signals include those derived
from SV40, as described in Sambrook, et al., supra, as well
as a bovine growth hormone terminator sequence. Introns,
containing splice donor and acceptor sites, may also be
designed into the constructs for use with the present
invention (Chapman et al., Nuc. Acids Res. (1991)
19:3979-3986).

Enhancer elements may also be used herein to increase
expression levels of the mammalian constructs. Examples
include the SV40 early gene enhancer, as described in
Dijkema et al., EMBO J. (1985) 4:761, the enhancer/promoter
derived from the long terminal repeat (LTR) of the Rous
Sarcoma Virus, as described in Gorman et al., Proc. Natl.
Acad. Sci. USA (1982b) 79:6777 and elements derived from
human CMV, as described in Boshart et al., Cell (1985)
41:521, such as elements included in the CMV intron A
sequence (Chapman et al., Nuc. Acids Res. (1991)
19:3979-3986).

The desired synthetic Gag polypeptide encoding
sequences can be cloned into any number of commercially
available vectors to generate expression of the polypeptide
in an appropriate host system. These systems include, but
are not limited to, the following: baculovirus expression
{Reilly, P.R., et al., Baculovirus Expression Vectors: A
Laboratory Manual (1992); Beames, et al., Biotechniques
11:378 (1991); Pharmingen; Clontech, Palo Alto, CA},
vaccinia expression {Earl, P.L., et al., "Expression of
proteins in mammalian cells using vaccinia" In Current
Protocols in Molecular Biology (F.M. Ausubel, et al. Eds.),
Greene Publishing Associates & Wiley Interscience, New York
(1991); Moss, B., et al., U.S. Patent Number 5,135,855,
issued 4 August 1992}, expression in bacteria {Ausubel,
F.M., et al., Current Protocols in Molecular Biology, John
Wiley and Sons, Inc., Media PA; Clontech}, expression in
yeast {Rosenberg, S. and Tekamp-Olson, P., U.S. Patent No.
RE35,749, issued March 17, 1998, herein incorporated by
reference; Shuster, J.R., U.S. Patent No. 5,629,203, issued
May 13, 1997, herein incorporated by reference; Gellissen,
G., et al., Antonie Van Leeuwenhoek, 62(1-2):79-93 (1992);
Romanos, M.A., et al., Yeast 8(6):423-488 (1992); Goeddel,
D.V., Methods in Enzymology 185 (1990); Guthrie, C., and
G.R. Fink, Methods in Enzymology 194 (1991)}, expression in
mammalian cells {Clontech; Gibco-BRL, Grand Island, NY;
e.g., Chinese hamster ovary (CHO) cell lines (Haynes, J., et
al., Nuc. Acid. Res. 11:687-706 (1983); Lau, Y.F., et al.,
Mol. Cell. Biol. 4:1469-1475 (1984); Kaufman, R.J.,
"Selection and coamplification of heterologous genes in
mammalian cells," in Methods in Enzymology, vol. 185,
pp. 537-566, Academic Press, Inc., San Diego CA (1991))},
and expression in plant cells {plant cloning vectors,
Clontech Laboratories, Inc., Palo Alto, CA, and Pharmacia
LKB Biotechnology, Inc., Piscataway, NJ; Hood, E., et al.,
J. Bacteriol. 161:1291-1301 (1986); Nagel, R., et al., FEMS
Microbiol. Lett. 67:325 (1990); An, et al., "Binary
Vectors", and others in Plant Molecular Biology Manual
A3:1-19 (1988); Miki, B.L.A., et al., pp. 249-265, and
others in Plant DNA Infectious Agents (Hohn, T., et al.,
eds.) Springer-Verlag, Wien, Austria, (1987); Plant
Molecular Biology: Essential Techniques, P.G. Jones and J.M.
Sutton, New York, J. Wiley, 1997; Miglani, Gurbachan,
Dictionary of Plant Genetics and Molecular Biology, New
York, Food Products Press, 1998; Henry, R.J., Practical
Applications of Plant Molecular Biology, New York, Chapman &
Hall, 1997}.

Also included in the invention is an expression vector, 
such as the CMV promoter-containing vectors described in 
Example 1, containing coding sequences and expression 
control elements which allow expression of the coding 
regions in a suitable host. The control elements generally 
include a promoter, translation initiation codon, and 
translation and transcription termination sequences, and an 
insertion site for introducing the insert into the vector. 
Translational control elements have been reviewed by M.
Kozak (e.g., Kozak, M., Mamm. Genome 7(8):563-574, 1996;
Kozak, M., Biochimie 76(9):815-821, 1994; Kozak, M., J Cell
Biol 108(2):229-241, 1989; Kozak, M., and Shatkin, A.J.,
Methods Enzymol 60:360-375, 1979).

Expression in yeast systems has the advantage of being
suited to commercial production. Recombinant protein
production by vaccinia and by CHO cell lines has the
advantage of using mammalian expression systems. Further,
vaccinia virus expression has several advantages including
the following: (i) its wide host range; (ii) faithful
post-translational modification, processing, folding,
transport, secretion, and assembly of recombinant proteins;
(iii) high level expression of relatively soluble
recombinant proteins; and (iv) a large capacity to
accommodate foreign DNA.

The recombinantly expressed polypeptides from synthetic
Gag-encoding expression cassettes are typically isolated
from lysed cells or culture media. Purification can be
carried out by methods known in the art including salt
fractionation, ion exchange chromatography, gel filtration,
size-exclusion chromatography, size-fractionation, and
affinity chromatography. Immunoaffinity chromatography can
be employed using antibodies generated based on, for
example, Gag antigens.

Advantages of expressing the Gag-containing proteins of
the present invention using mammalian cells include, but are
not limited to, the following: well-established protocols
for scale-up production; the ability to produce VLPs; cell
lines that are suitable to meet good manufacturing practice
(GMP) standards; and culture conditions for mammalian cells
that are known in the art.

2.1.4 Modification of HIV-1 Env Nucleic Acid Coding Sequences

One aspect of the present invention is the generation
of HIV-1 Env protein coding sequences, and related
sequences, having improved expression relative to the
corresponding wild-type sequence. Exemplary embodiments of
the present invention are illustrated herein modifying the
Env protein wild-type sequences obtained from the HIV-1
subtype B strains HIV-1US4 and HIV-1SF162 (Myers et al., Los
Alamos Database, Los Alamos National Laboratory, Los Alamos,
New Mexico (1992); Myers et al., Human Retroviruses and
Aids, 1997, Los Alamos, New Mexico: Los Alamos National
Laboratory). Env sequence obtained from other HIV variants
may be manipulated in similar fashion following the
teachings of the present specification. Such other variants
include those described above in Section 2.1.1 and on the
World Wide Web (Internet), for example at
http://hiv-web.lanl.gov/cgi-bin/hivDB3/public/wdb/ssampublic
and http://hiv-web.lanl.gov.

First, the HIV-1 codon usage pattern was modified so
that the resulting nucleic acid coding sequence was
comparable to codon usage found in highly expressed human
genes (Example 1). HIV codon usage reflects a high content
of the nucleotides A or T in the codon triplet. The effect
of the HIV-1 codon usage is a high AT content in the DNA
sequence, which results in decreased translation ability
and in instability of the mRNA. In comparison, the codons of
highly expressed human genes prefer the nucleotides G or C.
The Env coding sequences were therefore modified to be
comparable to the codon usage found in highly expressed
human genes. Experiments performed in support of the present
invention showed that the synthetic Env sequences were
capable of a higher level of protein production (see the
Examples) relative to the native Env sequences. One reason
for this increased production may be increased stability of
the mRNA corresponding to the synthetic Env coding sequences
versus the mRNA corresponding to the native Env coding
sequences.
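
By way of illustration only, the codon modification described
above can be sketched in Python. The sketch recodes an open
reading frame so that each amino acid is written with a single
GC-rich codon, and compares GC content before and after. The
PREFERRED_CODON table, the NATIVE_LIKE test sequence, and all
function names are assumptions made for this illustration; they
are not the codon table, sequences, or procedure of Example 1.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# ASSUMPTION: one illustrative GC-rich codon per amino acid. This is a
# hypothetical table for the sketch, not the table used in Example 1.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, usable, 3))


def gc_content(dna: str) -> float:
    """Return the fraction of G and C nucleotides in the sequence."""
    dna = dna.upper()
    return sum(base in "GC" for base in dna) / len(dna)


def recode(dna: str) -> str:
    """Rewrite each codon with the preferred codon for the same amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in translate(dna))


# Hypothetical AT-rich open reading frame, invented for this example.
NATIVE_LIKE = "ATGAAAGTTAAAGAAACATATTTAAGAATATAA"
SYNTHETIC = recode(NATIVE_LIKE)
```

In this sketch the encoded polypeptide is unchanged by recoding
while the GC content of the coding sequence rises, which is the
property the paragraph above attributes to the synthetic Env
sequences. A practical design would also weigh factors the
sketch ignores, such as restriction sites and repeated motifs.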

Modification of the Env polypeptide coding sequences
resulted in improved expression relative to the wild-type
coding sequences in a number of mammalian cell lines.
Similar Env polypeptide coding sequences can be obtained
from a variety of isolates (families, sub-types, etc.). Env
polypeptide encoding sequences derived from these variants
can be optimized and tested for improved expression in
mammals by following the teachings of the present
specification (see the Examples, in particular Example 2).

2.1.5 Further Modification of HIV-1 Env Nucleic Acid Coding Sequences

In addition to proteins containing HIV-related
sequences, the Env encoding sequences of the present
invention can be fused to other polypeptides (creating
chimeric polypeptides). Also, variations on the orientation
of the Env and other coding sequences, relative to each
other, are contemplated. Further, the HIV protein encoding
cassettes of the present invention can be co-expressed using
one vector or multiple vectors. In addition, the
polyproteins can be operably linked to the same or different
promoters.

Env polypeptide coding sequences can be obtained from
any HIV isolates (different families, subtypes, and strains)
including but not limited to the isolates HIV IIIb, HIV SF2,
HIV US4, HIV CM235, HIV SF162, HIV LAV, HIV^, HIV^ (see, e.g.,
Myers et al., Los Alamos Database, Los Alamos National
Laboratory, Los Alamos, New Mexico (1992); Myers et al., Human
Retroviruses and AIDS, 1997, Los Alamos, New Mexico: Los
Alamos National Laboratory). Synthetic expression cassettes
can be generated using such coding sequences as starting
material by following the teachings of the present
specification (e.g., see Example 1). Further, the synthetic
expression cassettes (and purified polynucleotides) of the
present invention include related Env polypeptide coding
sequences having greater than 90%, preferably greater than
92%, more preferably greater than 95%, and most preferably
greater than 98% sequence identity to the synthetic
expression cassette sequences disclosed herein (for example,
SEQ ID NOs: 71-72; and/or the sequences presented in Tables
1A and 1B) when the sequences of the present invention are
used as the query sequence.
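
The identity thresholds above can be illustrated with a short
Python sketch. It assumes the two sequences have already been
aligned by some external tool, are of equal aligned length, and
use "-" for gaps, and it counts identity over the positions of
the query sequence. The function names and this way of counting
are assumptions for illustration; the passage above does not
prescribe a particular alignment program or scoring method.

```python
# Thresholds recited above, from most to least stringent.
THRESHOLDS = (98.0, 95.0, 92.0, 90.0)


def percent_identity(query: str, subject: str) -> float:
    """Percent of query positions matched by the subject.

    ASSUMPTION: both strings come from a precomputed pairwise alignment,
    have equal length, and use '-' for gaps. Columns where the query has
    a gap are skipped, so the query defines the denominator.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (q, s) for q, s in zip(query.upper(), subject.upper()) if q != "-"
    ]
    if not columns:
        raise ValueError("query contains no residues")
    matches = sum(q == s for q, s in columns)
    return 100.0 * matches / len(columns)


def identity_tier(pct: float):
    """Return the most stringent threshold that pct exceeds, else None."""
    for threshold in THRESHOLDS:
        if pct > threshold:
            return threshold
    return None
```

Because the text recites "greater than" each threshold, the
comparison in identity_tier is strict: a value of exactly 90%
does not fall within any tier in this sketch.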

2.1.6 Expression of Synthetic Sequences Encoding HIV-1 Env
and Related Polypeptides

Several synthetic Env-encoding sequences (expression
cassettes) of the present invention were cloned into a
number of different expression vectors (Example 1) to
evaluate levels of expression and production of Env
polypeptide. The modified synthetic coding sequences are
presented as synthetic Env expression cassettes (Example 1,
e.g., Tables 1A and 1B). The synthetic DNA fragments for
Env were cloned into eucaryotic expression vectors described
in Example 1 and in Section 2.1.3 above, including a
transient expression vector and CMV-promoter-based mammalian
vectors. Corresponding wild-type sequences were cloned into
the same vectors.

These vectors were then transfected into several
different cell types, including a variety of mammalian cell
lines (293, RD, COS-7, and CHO cell lines, available, for
example, from the A.T.C.C.). The cell lines were cultured
under appropriate conditions and the levels of gp120, gp140
and gp160 Env expression in supernatants were evaluated
(Example 2). Env polypeptides include, but are not limited
to, for example, native gp160, oligomeric gp140, monomeric
gp120 as well as modified sequences of these polypeptides.
The results of these assays demonstrated that expression of
synthetic Env encoding sequences was significantly higher
than that of corresponding wild-type sequences (Example 2;
Tables 3 and 4).

Further, Western Blot analysis showed that cells
containing the synthetic Env expression cassette produced
the expected protein (gp120, gp140 or gp160) at higher
per-cell concentrations than cells containing the native
expression cassette. The Env proteins were seen in both
cell lysates and supernatants. The levels of production
were significantly higher in cell supernatants for cells
transfected with the synthetic Env expression cassettes of
the present invention as compared to wild type.

Fractionation of the supernatants from mammalian cells
transfected with the synthetic Env expression cassettes
showed that the cassettes provide superior production of Env
proteins, relative to the wild-type Env sequences (Examples
2 and 3).

Efficient expression of these Env-containing
polypeptides in mammalian cell lines provides the following
benefits: the Env polypeptides are free of baculovirus or
other viral contaminants; production by established methods
approved by the FDA; increased purity; greater yields
(relative to native coding sequences); and a novel method of
producing the Env-containing polypeptides in CHO cells which
is less feasible in the absence of the increased expression
obtained using the constructs of the present invention.

Exemplary cell lines (e.g., mammalian, yeast, insect,
etc.) include those described above in Section 2.1.3 for
Gag-containing constructs. Further, appropriate vectors and
control elements (e.g., promoters, enhancers,
polyadenylation sequences, etc.) for any given cell type can
be selected, as described above in Section 2.1.3, by one
having ordinary skill in the art in view of the teachings of
the present specification and information known in the art
about expression vectors. In addition, the recombinantly
expressed polypeptides from synthetic Env-encoding
expression cassettes are typically isolated and purified
from lysed cells or culture media, as described above for
Gag-encoding expression cassettes. An exemplary
purification is described in Example 4 and shown in Figure
60.

2.1.7 Modification of HIV-1 Tat Nucleic Acid Coding Sequences

Another aspect of the present invention is the
generation of HIV-1 tat protein coding sequences, and
related sequences, having improved expression relative to
the corresponding wild-type sequence. Exemplary embodiments
of the present invention are illustrated herein by modifying
the tat wild-type nucleotide sequence (SEQ ID NO: 85, Figure
76) obtained from SF162 as described above. Exemplary
synthetic tat constructs are shown in SEQ ID NO: 87, which
depicts a tat construct encoding a full-length tat
polypeptide from strain SF162; SEQ ID NO: 88, which depicts a
tat construct encoding a tat polypeptide having the cysteine
residue at position 22 changed; and SEQ ID NO: 89, which
depicts a tat construct encoding the amino terminal portion
of a tat polypeptide from strain SF162. The amino portion
of the tat protein appears to contain many of the epitopes
that induce an immune response. In addition, further
modifications include replacement or deletion of the cysteine
residue at position 22, for example with a valine residue,
an alanine residue or a glycine residue (SEQ ID NOs: 88 and
89, Figures 79 and 81), see, e.g., Caputo et al. (1996) Gene
Ther. 3:235. In Figure 81, which depicts a tat construct
encoding the amino terminal portion of a tat polypeptide,
the nucleotides (nucleotides 64-66) encoding the cysteine
residue are underlined. The design and construction of
suitable constructs can be readily accomplished using
the teachings of the present specification. As with Gag,
pol, prot and Env, tat polypeptide coding sequences can be
obtained from a variety of isolates (families, sub-types,
etc.).

Modification of the tat polypeptide coding sequences
results in improved expression relative to the wild-type
coding sequences in a number of cell lines (e.g., mammalian,
yeast, bacterial and insect cells). Tat polypeptide
encoding sequences derived from these variants can be
optimized and tested for improved expression in mammals by
following the teachings of the present specification (see
the Examples, in particular Example 2).

Various forms of the different embodiments of the
invention, described herein, may be combined. For example,
polynucleotides may be derived from the polynucleotide
sequences of the present invention, including, but not
limited to, coding sequences for Gag polypeptides, Env
polypeptides, polymerase polypeptides, protease
polypeptides, tat polypeptides, and reverse transcriptase
polypeptides. Further, the polynucleotide coding sequences
of the present invention may be combined into
multi-cistronic expression cassettes where typically each
coding sequence for each polypeptide is preceded by IRES
sequences.

2.2 Production of Virus-like Particles and Use of the Constructs
of the Present Invention to Create Packaging Cell Lines

The group-specific antigens (Gag) of human
immunodeficiency virus type-1 (HIV-1) self-assemble into
noninfectious virus-like particles (VLP) that are released
from various eucaryotic cells by budding (reviewed by Freed,
E.O., Virology 251:1-15, 1998). The synthetic expression
cassettes of the present invention provide efficient means
for the production of HIV-Gag virus-like particles (VLPs)
using a variety of different cell types, including, but not
limited to, mammalian cells.

Viral particles can be used as a matrix for the proper
presentation of an antigen entrapped or associated therewith
to the immune system of the host. For example, U.S. Patent
No. 4,722,840 describes hybrid particles comprised of a
particle-forming fragment of a structural protein from a
virus, such as a particle-forming fragment of hepatitis B
virus (HBV) surface antigen (HBsAg), fused to a heterologous
polypeptide. Tindle et al., Virology (1994) 200:547-557,
describe the production and use of chimeric HBV core
antigen particles containing epitopes of human
papillomavirus (HPV) type 16 E7 transforming protein.

Adams et al., Nature (1987) 329:68-70, describe the
recombinant production of hybrid HIV gp120:Ty VLPs in yeast
and Brown et al., Virology (1994) 198:477-488, the
production of chimeric proteins consisting of the VP2
protein of human parvovirus B19 and epitopes from human
herpes simplex virus type 1, as well as mouse hepatitis
virus A59. Wagner et al. (Virology (1994) 200:162-175;
Brand et al., J. Virol. Meth. (1995) 51:153-168; Virology
(1996) 220:128-140) and Wolf et al. (EP 0 449 116 A1,
published 2 October 1991; WO 96/30523, published 3 October
1996) describe the assembly of chimeric HIV-1 p55Gag
particles. U.S. Patent No. 5,503,833 describes the use of
rotavirus VP6 spheres for encapsulating and delivering
therapeutic agents.

2.2.1 VLP PRODUCTION USING THE SYNTHETIC EXPRESSION CASSETTES
OF THE PRESENT INVENTION

Experiments performed in support of the present 
invention have demonstrated that the synthetic expression 
cassettes of the present invention provide superior 
production of both protein and VLPs, relative to native 
coding sequences (Examples 7 and 15). Further, electron
microscopic evaluation of VLP production (Examples 6 and 15, 
Figures 3A-B and 65A-F) showed that free and budding 
immature virus particles of the expected size were produced 
by cells containing the synthetic expression cassettes. 

Using the synthetic expression cassettes of the present
invention, rather than native coding sequences, for the
production of virus-like particles provides several
advantages. First, VLPs can be produced in enhanced
quantity, making isolation and purification of the VLPs
easier. Second, VLPs can be produced in a variety of cell
types using the synthetic expression cassettes; in
particular, mammalian cell lines can be used for VLP
production, for example, CHO cells. Production using CHO
cells provides (i) VLP formation; (ii) correct myristylation
and budding; (iii) absence of non-mammalian cell
contaminants (e.g., insect viruses and/or cells); and (iv)
ease of purification. The synthetic expression cassettes of
the present invention are also useful for enhanced
expression in cell-types other than mammalian cell lines.
For example, infection of insect cells with baculovirus
vectors encoding the synthetic expression cassettes resulted
in higher levels of total protein yield and higher levels of
VLP production (relative to wild-type coding sequences).
Further, the final product from insect cells infected with
the baculovirus-Gag synthetic expression cassettes
consistently contained lower amounts of contaminating insect
proteins than the final product when wild-type coding
sequences were used (Examples).

VLPs can spontaneously form when the particle-forming
polypeptide of interest is recombinantly expressed in an
appropriate host cell. Thus, the VLPs produced using the
synthetic expression cassettes of the present invention are
conveniently prepared using recombinant techniques. As
discussed below, the Gag polypeptide encoding synthetic
expression cassettes of the present invention can include
other polypeptide coding sequences of interest (for example,
Env, tat, rev, HIV protease, HIV polymerase, HCV core; see
Example 1). Expression of such synthetic expression
cassettes yields VLPs comprising the product of the
synthetic expression cassette, as well as the polypeptide
of interest.

Once coding sequences for the desired particle-forming
polypeptides have been isolated or synthesized, they can be
cloned into any suitable vector or replicon for expression.
Numerous cloning vectors are known to those of skill in the
art, and the selection of an appropriate cloning vector is a
matter of choice. See, generally, Ausubel et al., supra or
Sambrook et al., supra. The vector is then used to transform
an appropriate host cell. Suitable recombinant expression
systems include, but are not limited to, bacterial,
mammalian, baculovirus/insect, vaccinia, Semliki Forest
virus (SFV), Alphaviruses (such as Sindbis, Venezuelan
Equine Encephalitis (VEE)), yeast and Xenopus
expression systems, well known in the art. Particularly
preferred expression systems are mammalian cell lines,
vaccinia, Sindbis, insect and yeast systems.

For example, a number of mammalian cell lines are known
in the art and include immortalized cell lines available
from the American Type Culture Collection (A.T.C.C.), such
as, but not limited to, Chinese hamster ovary (CHO) cells,
293 cells, HeLa cells, baby hamster kidney (BHK) cells,
mouse myeloma (SB20), monkey kidney cells (COS), as well as
others. Similarly, bacterial hosts such as E. coli,
Bacillus subtilis, and Streptococcus spp., will find use
with the present expression constructs. Yeast hosts useful
in the present invention include, inter alia, Saccharomyces
cerevisiae, Candida albicans, Candida maltosa, Hansenula
polymorpha, Kluyveromyces fragilis, Kluyveromyces lactis,
Pichia guilliermondii, Pichia pastoris, Schizosaccharomyces
pombe and Yarrowia lipolytica. Insect cells for use with
baculovirus expression vectors include, inter alia, Aedes
aegypti, Autographa californica, Bombyx mori, Drosophila
melanogaster, Spodoptera frugiperda, and Trichoplusia ni.
See, e.g., Summers and Smith, Texas Agricultural Experiment
Station Bulletin No. 1555 (1987). Fungal hosts include, for
example, Aspergillus.

Viral vectors can be used for the production of
particles in eucaryotic cells, such as those derived from
the pox family of viruses, including vaccinia virus and
avian poxvirus. Additionally, a vaccinia based
infection/transfection system, as described in Tomei et al.,
J. Virol. (1993) 67:4017-4026 and Selby et al., J. Gen.
Virol. (1993) 74:1103-1113, will also find use with the
present invention. In this system, cells are first infected
in vitro with a vaccinia virus recombinant that encodes the
bacteriophage T7 RNA polymerase. This polymerase displays
exquisite specificity in that it only transcribes templates
bearing T7 promoters. Following infection, cells are
transfected with the DNA of interest, driven by a T7
promoter. The polymerase expressed in the cytoplasm from
the vaccinia virus recombinant transcribes the transfected
DNA into RNA which is then translated into protein by the
host translational machinery. Alternately, T7 can be added
as a purified protein or enzyme as in the "Progenitor"
system (Studier and Moffatt, J. Mol. Biol. (1986)
189:113-130). The method provides for high level, transient,
cytoplasmic production of large quantities of RNA and its
translation product(s).

Depending on the expression system and host selected,
the VLPs are produced by growing host cells transformed by
an expression vector under conditions whereby the
particle-forming polypeptide is expressed and VLPs can be
formed. The selection of the appropriate growth conditions
is within the skill of the art. If the VLPs are formed
intracellularly, the cells are then disrupted, using
chemical, physical or mechanical means, which lyse the cells
yet keep the VLPs substantially intact. Such methods are
known to those of skill in the art and are described in,
e.g., Protein Purification Applications: A Practical
Approach, (E.L.V. Harris and S. Angal, Eds., 1990).

The particles are then isolated (or substantially
purified) using methods that preserve the integrity thereof,
such as by density gradient centrifugation, e.g., sucrose
gradients, PEG-precipitation, pelleting, and the like (see,
e.g., Kirnbauer et al. J. Virol. (1993) 67:6929-6936), as
well as standard purification techniques including, e.g.,
ion exchange and gel filtration chromatography.

VLPs produced by cells containing the synthetic
expression cassettes of the present invention can be used to
elicit an immune response when administered to a subject.
One advantage of the present invention is that VLPs can be
produced by mammalian cells carrying the synthetic
expression cassettes at levels previously not possible. As
discussed above, the VLPs can comprise a variety of antigens
in addition to the Gag polypeptides (e.g., Env, tat,
Gag-protease, Gag-polymerase, Gag-HCV-core). Purified VLPs,
produced using the synthetic expression cassettes of the
present invention, can be administered to a vertebrate
subject, usually in the form of vaccine compositions.
Combination vaccines may also be used, where such vaccines
contain, for example, other subunit proteins derived from
HIV or other organisms (e.g., env) or gene delivery vaccines
encoding such antigens. Administration can take place using
the VLPs formulated alone or formulated with other antigens.
Further, the VLPs can be administered prior to, concurrent
with, or subsequent to, delivery of the synthetic expression
cassettes for DNA immunization (see below) and/or delivery
of other vaccines. Also, the site of VLP administration may
be the same as or different from that of other vaccine
compositions that are being administered. Gene delivery can
be accomplished by a number of methods including, but not
limited to, immunization with DNA, alphavirus vectors, pox
virus vectors, and vaccinia virus vectors.

VLP immune-stimulating (or vaccine) compositions can
include various excipients, adjuvants, carriers, auxiliary
substances, modulating agents, and the like. The immune
stimulating compositions will include an amount of the
VLP/antigen sufficient to mount an immunological response.
An appropriate effective amount can be determined by one of
skill in the art. Such an amount will fall in a relatively
broad range that can be determined through routine trials
and will generally be an amount on the order of about 0.1 µg
to about 1000 µg, more preferably about 1 µg to about 300
µg, of VLP/antigen.
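
Purely as an illustration of the two ranges just recited, the
following Python sketch places a candidate VLP/antigen amount,
in micrograms, relative to the general and the more preferred
range. The constant and function names are hypothetical, and
the bounds are treated as sharp cut-offs only for the sake of
the example; the text qualifies each bound with "about" and
leaves the effective amount to routine trials.

```python
# ASSUMPTION: bounds transcribed from the "about ... to about ..." wording
# above; they are approximate guides and not hard limits.
GENERAL_RANGE_UG = (0.1, 1000.0)
PREFERRED_RANGE_UG = (1.0, 300.0)


def classify_amount(amount_ug: float) -> str:
    """Place a VLP/antigen amount (micrograms) relative to the stated ranges."""
    if amount_ug <= 0:
        raise ValueError("amount must be positive")
    low, high = PREFERRED_RANGE_UG
    if low <= amount_ug <= high:
        return "preferred"
    low, high = GENERAL_RANGE_UG
    if low <= amount_ug <= high:
        return "general"
    return "outside"
```

A helper of this kind only records where an amount sits in the
recited ranges; it does not determine an effective amount.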

A carrier is optionally present which is a molecule
that does not itself induce the production of antibodies
harmful to the individual receiving the composition.
Suitable carriers are typically large, slowly metabolized
macromolecules such as proteins, polysaccharides, polylactic
acids, polyglycolic acids, polymeric amino acids, amino
acid copolymers, lipid aggregates (such as oil droplets or
liposomes), and inactive virus particles. Examples of
particulate carriers include those derived from polymethyl
methacrylate polymers, as well as microparticles derived
from poly(lactides) and poly(lactide-co-glycolides), known
as PLG. See, e.g., Jeffery et al., Pharm. Res. (1993)
10:362-368; McGee JP, et al., J Microencapsul.
14(2):197-210, 1997; O'Hagan DT, et al., Vaccine
11(2):149-54, 1993. Such carriers are well known to those of
ordinary skill in the art. Additionally, these carriers may
function as immunostimulating agents ("adjuvants").
Furthermore, the antigen may be conjugated to a bacterial
toxoid, such as toxoid from diphtheria, tetanus, cholera,
etc., as well as toxins derived from E. coli.

Such adjuvants include, but are not limited to: (1)
aluminum salts (alum), such as aluminum hydroxide, aluminum
phosphate, aluminum sulfate, etc.; (2) oil-in-water emulsion
formulations (with or without other specific
immunostimulating agents such as muramyl peptides (see
below) or bacterial cell wall components), such as for
example (a) MF59 (International Publication No. WO
90/14837), containing 5% Squalene, 0.5% Tween 80, and 0.5%
Span 85 (optionally containing various amounts of MTP-PE
(see below), although not required) formulated into
submicron particles using a microfluidizer such as Model
110Y microfluidizer (Microfluidics, Newton, MA), (b) SAF,
containing 10% Squalane, 0.4% Tween 80, 5% pluronic-blocked
polymer L121, and thr-MDP (see below) either microfluidized
into a submicron emulsion or vortexed to generate a larger
particle size emulsion, and (c) Ribi™ adjuvant system (RAS),
(Ribi Immunochem, Hamilton, MT) containing 2% Squalene, 0.2%
Tween 80, and one or more bacterial cell wall components
from the group consisting of monophosphorylipid A (MPL),
trehalose dimycolate (TDM), and cell wall skeleton (CWS),
preferably MPL + CWS (Detox™); (3) saponin adjuvants, such
as Stimulon™ (Cambridge Bioscience, Worcester, MA), or
particles generated therefrom such as ISCOMs
(immunostimulating complexes); (4) Complete Freund's Adjuvant
(CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines,
such as interleukins (IL-1, IL-2, etc.), macrophage colony
stimulating factor (M-CSF), tumor necrosis factor (TNF),
beta chemokines (MIP-1-alpha, MIP-1-beta, RANTES, etc.); (6)
detoxified mutants of a bacterial ADP-ribosylating toxin
such as a cholera toxin (CT), a pertussis toxin (PT), or an
E. coli heat-labile toxin (LT), particularly LT-K63 (where
lysine is substituted for the wild-type amino acid at
position 63), LT-R72 (where arginine is substituted for the
wild-type amino acid at position 72), CT-S109 (where serine
is substituted for the wild-type amino acid at position
109), and PT-K9/G129 (where lysine is substituted for the
wild-type amino acid at position 9 and glycine substituted
at position 129) (see, e.g., International Publication Nos.
WO 93/13202 and WO 92/19265); and (7) other substances that
act as immunostimulating agents to enhance the effectiveness
of the composition.

Muramyl peptides include, but are not limited to,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(MTP-PE), etc.

Dosage treatment with the VLP composition may be a
single dose schedule or a multiple dose schedule. A
multiple dose schedule is one in which a primary course of
vaccination may be with 1-10 separate doses, followed by
other doses given at subsequent time intervals, chosen to
maintain and/or reinforce the immune response, for example
at 1-4 months for a second dose, and if needed, a subsequent
dose(s) after several months. The dosage regimen will also,
at least in part, be determined by the potency of the
modality, the vaccine delivery employed, the need of the
subject and be dependent on the judgment of the
practitioner.

If prevention of disease is desired (e.g., reduction of
symptoms, recurrences or of disease progression), the
antigen carrying VLPs are generally administered prior to
primary infection with the pathogen of interest. If
treatment is desired, e.g., the reduction of symptoms or
recurrences, the VLP compositions are generally administered
subsequent to primary infection.

2.2.2 USING THE SYNTHETIC EXPRESSION CASSETTES OF THE PRESENT
INVENTION TO CREATE PACKAGING CELL LINES

A number of viral based systems have been developed for
use as gene transfer vectors for mammalian host cells. For
example, retroviruses (in particular, lentiviral vectors)
provide a convenient platform for gene delivery systems. A
coding sequence of interest (for example, a sequence useful
for gene therapy applications) can be inserted into a gene
delivery vector and packaged in retroviral particles using
techniques known in the art. Recombinant virus can then be
isolated and delivered to cells of the subject either in
vivo or ex vivo. A number of retroviral systems have been
described, including, for example, the following: U.S.
Patent No. 5,219,740; Miller et al. (1989) Biotechniques
7:980; Miller, A.D. (1990) Human Gene Therapy 1:5; Scarpa et
al. (1991) Virology 180:849; Burns et al. (1993) Proc. Natl.
Acad. Sci. USA 90:8033; Boris-Lawrie et al. (1993) Cur.
Opin. Genet. Develop. 3:102; GB 2200651; EP 0415731; EP
0345242; WO 89/02468; WO 89/05349; WO 89/09271; WO 90/02806;
WO 90/07936; WO 94/03622; WO 93/25698; WO 93/25234; WO
93/11230; WO 93/10218; WO 91/02805; in U.S. 5,219,740; U.S.
4,405,712; U.S. 4,861,719; U.S. 4,980,289 and U.S.
4,777,127; in U.S. Serial No. 07/800,921; and in
Vile (1993) Cancer Res 53:3860-3864; Vile (1993) Cancer Res
53:962-967; Ram (1993) Cancer Res 53:83-88; Takamiya (1992)
J Neurosci Res 33:493-503; Baba (1993) J Neurosurg
79:729-735; Mann (1983) Cell 33:153; Cane (1984) Proc Natl
Acad Sci USA 81:6349; and Miller (1990) Human Gene Therapy 1.

Sequences useful for gene therapy applications include,
but are not limited to, the following. Factor VIII cDNA,
including derivatives and deletions thereof (International
Publication Nos. WO 96/21035, WO 97/03193, WO 97/03194, WO
97/03195, and WO 97/03191, all of which are hereby
incorporated by reference). Factor IX cDNA (Kurachi et al.
(1982) Proc. Natl. Acad. Sci. USA 79:6461-6464). Factor V
cDNA can be obtained from pMT2-V (Jenny (1987) Proc. Natl.
Acad. Sci. USA 84:4846, A.T.C.C. Deposit No. 40515). A
full-length factor V cDNA, or a B domain deletion or B
domain substitution thereof, can be used. B domain
deletions of factor V include those reported by Marquette
(1995) Blood 86:3026 and Kane (1990) Biochemistry 29:6762.
Antithrombin III cDNA (Prochownik (1983) J. Biol. Chem.
258:8389, A.T.C.C. Deposit No. 57224/57225). Protein C
encoding cDNA (Foster (1984) Proc. Natl. Acad. Sci. USA
81:4766; Beckmann (1985) Nucleic Acids Res. 13:5233).
Prothrombin cDNA can be obtained by restriction enzyme
digestion of a published vector (Degen (1983) Biochemistry
22:2087). The endothelial cell surface protein,
thrombomodulin, is a necessary cofactor for the normal
activation of protein C by thrombin. A soluble recombinant
form has been described (Parkinson (1990) J. Biol. Chem.
265:12602; Jackman (1987) Proc. Natl. Acad. Sci. USA
84:6425; Shirai (1988) J. Biochem. 103:281; Wen (1987)
Biochemistry 26:4350; Suzuki (1987) EMBO J. 6:1891, A.T.C.C.
Deposit No. 61348, 61349).

Many genetic diseases caused by inheritance of
defective genes result in the failure to produce normal gene
products, for example, thalassemia, phenylketonuria,
Lesch-Nyhan syndrome, severe combined immunodeficiency
(SCID), hemophilia A and B, cystic fibrosis, Duchenne's
Muscular Dystrophy, inherited emphysema and familial
hypercholesterolemia (Mulligan et al. (1993) Science
260:926; Anderson et al. (1992) Science 256:808; Friedman et
al. (1989) Science 244:1275). Although genetic diseases may
result in the absence of a gene product, endocrine
disorders, such as diabetes and hypopituitarism, are caused
by the inability of the gene to produce adequate levels of
the appropriate hormone, insulin and human growth hormone,
respectively.

In one aspect, gene therapy employing the constructs
and methods of the present invention involves the
introduction of normal recombinant genes into T cells so
that new or missing proteins are produced by the T cells
after introduction or reintroduction thereof into a patient.
A number of genetic diseases have been selected for
treatment with gene therapy, including adenosine deaminase
deficiency, cystic fibrosis, α1-antitrypsin deficiency,
Gaucher's syndrome, as well as non-genetic diseases.

In particular, Gaucher's syndrome is a genetic disorder
characterized by a deficiency of the enzyme
glucocerebrosidase. This enzyme deficiency leads to the
accumulation of glucocerebroside in the lysosomes of all
cells in the body. (For a review see Science 256:794 (1992)
and Scriver et al., The Metabolic Basis of Inherited
Disease, 6th ed., vol. 2, page 1677). Thus, gene transfer
vectors that express glucocerebrosidase can be constructed
for use in the treatment of this disorder. Likewise, gene
transfer vectors encoding lactase can be used in the
treatment of hereditary lactose intolerance, those
expressing ADA can be used for treatment of ADA deficiency,
and gene transfer vectors encoding α1-antitrypsin can be
used to treat α1-antitrypsin deficiency. See Ledley, F.D.
(1987) J. Pediatrics 110:157-174, Verma, I. (Nov. 1987)
Scientific American pp. 68-84, and International Publication
No. WO 95/27512 entitled "Gene Therapy Treatment for a
Variety of Diseases and Disorders," for a description of
gene therapy treatment of genetic diseases.

In still further embodiments of the invention,
nucleotide sequences which can be incorporated into a gene
transfer vector include, but are not limited to, those
encoding proteins associated with enzyme-deficiency
disorders, such as the cystic fibrosis transmembrane
regulator (see, for example, U.S. Patent No. 5,240,846 and
Larrick et al. (1991) Gene Therapy Applications of Molecular
Biology, Elsevier, New York) and adenosine deaminase (ADA)
(see U.S. Patent No. 5,399,346); growth factors, or an
agonist or antagonist of a growth factor (Bandara et al.
(1992) DNA and Cell Biology, 11:227); one or more tumor
suppressor genes such as p53, Rb, or C-CAM1 (Kleinerman et
al. (1995) Cancer Research 55:2831); a molecule that
modulates the immune system of an organism, such as a HLA
molecule (Nabel et al. (1993) Proc. Natl. Acad. Sci. USA
90:11307); a ribozyme (Larsson et al. (1996) Virology
219:161); a peptide nucleic acid (Hirshman et al. (1996) J.
Invest. Med. 44:347); an antisense molecule (Bordier et al.
(1995) Proc. Natl. Acad. Sci. USA 92:9383) which can be used
to down-regulate the expression or synthesis of aberrant or
foreign proteins, such as HIV proteins or a wide variety of
oncogenes such as p53 (Hesketh, The Oncogene Facts Book,
Academic Press, New York (1995)); a biopharmaceutical agent
or antisense molecule used to treat HIV infection, such as
an inhibitor of p24 (Nakashima et al. (1994) Nucleic Acids
Res. 22:5004); or reverse transcriptase (see Bordier,
supra).

Other proteins of therapeutic interest can be expressed
in vivo by gene transfer vectors using the methods of the
invention. For instance, sustained in vivo expression of
tissue factor inhibitory protein (TFPI) is useful for
treatment of conditions including sepsis and DIC and in
preventing reperfusion injury. (See International
Publication Nos. WO 93/24143, WO 93/25230 and WO 96/06637.)
Nucleic acid sequences encoding various forms of TFPI can be
obtained, for example, as described in US Patent Nos.
4,966,852; 5,106,833; and 5,466,783, and incorporated into
the gene transfer vectors described herein.

Erythropoietin (EPO) and leptin can also be expressed
in vivo from genetically modified T cells according to the
methods of the invention. For instance, EPO is useful in
gene therapy treatment of a variety of disorders including
anemia (see International Publication No. WO 95/13376
entitled "Gene Therapy for Treatment of Anemia"). Sustained
delivery of leptin by the methods of the invention is useful
in treatment of obesity. See International Publication No.
WO 96/05309 for a description of the leptin gene and the use
thereof in the treatment of obesity.

A variety of other disorders can also be treated by the 
methods of the invention. For example, sustained in vivo 
systemic production of apolipoprotein E or apolipoprotein A
from genetically modified T cells can be used for treatment
of hyperlipidemia (see Breslow et al. (1994) Biotechnology
12:365). Sustained production of angiotensin receptor
inhibitor (Goodfriend et al. (1996) N. Engl. J. Med.
334:1469) can be provided by the methods described herein.
As yet an additional example, the long term in vivo systemic
production of angiostatin is useful in the treatment of a
variety of tumors. (See O'Reilly et al. (1996) Nature Med.
2:689.)

In other embodiments, gene transfer vectors can be
constructed to encode a cytokine or other immunomodulatory
molecule. For example, nucleic acid sequences encoding
native IL-2 and gamma-interferon can be obtained as
described in US Patent Nos. 4,738,927 and 5,326,859,
respectively, while useful muteins of these proteins can be
obtained as described in U.S. Patent No. 4,853,332. Nucleic
acid sequences encoding the short and long forms of mCSF can
be obtained as described in US Patent Nos. 4,847,201 and
4,879,227, respectively. In particular aspects of the
invention, retroviral vectors expressing cytokine or
immunomodulatory genes can be produced as described herein
(for example, employing the packaging cell lines of the
present invention) and in International Application No. PCT
US 94/02951, entitled "Compositions and Methods for Cancer
Immunotherapy."

Examples of suitable immunomodulatory molecules for use
herein include the following: IL-1 and IL-2 (Karupiah et al.
(1990) J. Immunology 144:290-298, Weber et al. (1987) J.
Exp. Med. 166:1716-1733, Gansbacher et al. (1990) J. Exp.
Med. 172:1217-1224, and U.S. Patent No. 4,738,927); IL-3 and
IL-4 (Tepper et al. (1989) Cell 57:503-512, Golumbek et al.
(1991) Science 254:713-716, and U.S. Patent No. 5,017,691);
IL-5 and IL-6 (Brakenhof et al. (1987) J. Immunol.
139:4116-4121, and International Publication No. WO
90/06370); IL-7 (U.S. Patent No. 4,965,195); IL-8, IL-9,
IL-10, IL-11, IL-12, and IL-13 (Cytokine Bulletin, Summer
1994); IL-14 and IL-15; alpha interferon (Finter et al.
(1991) Drugs 42:749-765, U.S. Patent Nos. 4,892,743 and
4,966,843, International Publication No. WO 85/02862, Nagata
et al. (1980) Nature 284:316-320, Familletti et al. (1981)
Methods in Enz. 78:387-394, Twu et al. (1989) Proc. Natl.
Acad. Sci. USA 86:2046-2050, and Faktor et al. (1990)
Oncogene 5:867-872); beta-interferon (Seif et al. (1991) J.
Virol. 65:664-671); gamma-interferons (Radford et al. (1991)
The American Society of Hepatology 20082015, Watanabe et al.
(1989) Proc. Natl. Acad. Sci. USA 86:9456-9460, Gansbacher
et al. (1990) Cancer Research 50:7820-7825, Maio et al.
(1989) Can. Immunol. Immunother. 30:34-42, and U.S. Patent
Nos. 4,762,791 and 4,727,138); G-CSF (U.S. Patent Nos.
4,999,291 and 4,810,643); GM-CSF (International Publication
No. WO 85/04188); tumor necrosis factors (TNFs) (Jayaraman
et al. (1990) J. Immunology 144:942-951); CD3 (Krissanen et
al. (1987) Immunogenetics 26:258-266); ICAM-1 (Altman et al.
(1989) Nature 338:512-514, Simmons et al. (1988) Nature
331:624-627); ICAM-2, LFA-1, LFA-3 (Wallner et al. (1987) J.
Exp. Med. 166:923-932); MHC class I molecules, MHC class II
molecules, B7.1-.3, β2-microglobulin (Parnes et al. (1981)
Proc. Natl. Acad. Sci. USA 78:2253-2257); chaperones such as
calnexin; and MHC-linked transporter proteins or analogs
thereof (Powis et al. (1991) Nature 354:528-531).
Immunomodulatory factors may also be agonists, antagonists,
or ligands for these molecules. For example, soluble forms
of receptors can often behave as antagonists for these types
of factors, as can mutated forms of the factors themselves.
Nucleic acid molecules that encode the above-described
substances, as well as other nucleic acid molecules that are
advantageous for use within the present invention, may be
readily obtained from a variety of sources, including, for
example, depositories such as the American Type Culture
Collection, or from commercial sources such as British
Bio-Technology Limited (Cowley, Oxford, England).
Representative examples include BBG 12 (containing the
GM-CSF gene coding for the mature protein of 127 amino
acids), BBG 6 (which contains sequences encoding gamma
interferon), A.T.C.C. Deposit No. 39656 (which contains
sequences encoding TNF), A.T.C.C. Deposit No. 20663 (which
contains sequences encoding alpha-interferon), A.T.C.C.
Deposit Nos. 31902, 31902 and 39517 (which contain sequences
encoding beta-interferon), A.T.C.C. Deposit No. 67024 (which
contains a sequence which encodes Interleukin-1b), A.T.C.C.
Deposit Nos. 39405, 39452, 39516, 39626 and 39673 (which
contain sequences encoding Interleukin-2), A.T.C.C. Deposit
Nos. 59399, 59398, and 67326 (which contain sequences
encoding Interleukin-3), A.T.C.C. Deposit No. 57592 (which
contains sequences encoding Interleukin-4), A.T.C.C. Deposit
Nos. 59394 and 59395 (which contain sequences encoding
Interleukin-5), and A.T.C.C. Deposit No. 67153 (which
contains sequences encoding Interleukin-6).
Plasmids containing cytokine genes or immunomodulatory
genes (International Publication Nos. WO 94/02951 and WO
96/21015, both of which are incorporated by reference in
their entirety) can be digested with appropriate restriction
enzymes, and DNA fragments containing the particular gene of
interest can be inserted into a gene transfer vector using
standard molecular biology techniques. (See, e.g., Sambrook
et al., supra, or Ausubel et al. (eds) Current Protocols in
Molecular Biology, Greene Publishing and
Wiley-Interscience.)

Exemplary hormones, growth factors and other proteins
which are useful for long term expression are described, for
example, in European Publication No. 0437478B1, entitled
"Cyclodextrin-Peptide Complexes." Nucleic acid sequences
encoding a variety of hormones can be used, including those
encoding human growth hormone, insulin, calcitonin,
prolactin, follicle stimulating hormone (FSH), luteinizing
hormone (LH), human chorionic gonadotropin (HCG), and
thyroid stimulating hormone (TSH). A variety of different
forms of IGF-1 and IGF-2 growth factor polypeptides are also
well known in the art and can be incorporated into gene
transfer vectors for long term expression in vivo. See,
e.g., European Patent No. 0123228B1, published for grant
September 19, 1993, entitled "Hybrid DNA Synthesis of Mature
Insulin-like Growth Factors." As an additional example, the
long term in vivo expression of different forms of
fibroblast growth factor can also be effected employing the
compositions and methods of the invention. See, e.g., U.S.
Patent Nos. 5,464,774, 5,155,214, and 4,994,559 for a
description of different fibroblast growth factors.

Polynucleotide sequences coding for the above-described
molecules can be obtained using recombinant methods, such as
by screening cDNA and genomic libraries from cells
expressing the gene, or by deriving the gene from a vector
known to include the same. For example, plasmids which
contain sequences that encode altered cellular products may
be obtained from a depository such as the A.T.C.C., or from
commercial sources. Plasmids containing the nucleotide
sequences of interest can be digested with appropriate
restriction enzymes, and DNA fragments containing the
nucleotide sequences can be inserted into a gene transfer
vector using standard molecular biology techniques.

Alternatively, cDNA sequences for use with the present 
invention may be obtained from cells which express or 
contain the sequences, using standard techniques, such as 
phenol extraction and PCR of cDNA or genomic DNA. See,
e.g., Sambrook et al., supra, for a description of
techniques used to obtain and isolate DNA. Briefly, mRNA
from a cell which expresses the gene of interest can be
reverse transcribed with reverse transcriptase using
oligo-dT or random primers. The single stranded cDNA may
then be amplified by PCR (see U.S. Patent Nos. 4,683,202,
4,683,195 and 4,800,159; see also PCR Technology: Principles
and Applications for DNA Amplification, Erlich (ed.),
Stockton Press, 1989) using oligonucleotide primers
complementary to sequences on either side of the desired
sequences.
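
The choice of primers flanking a desired sequence can be illustrated with a minimal sketch. The function names, the fixed primer length and the example template below are hypothetical and are not taken from the specification; practical primer design would also weigh melting temperature, GC content and secondary structure.

```python
# Illustrative sketch only: names, primer length and template are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def flanking_primers(template: str, start: int, end: int, length: int = 18):
    """Pick primers on either side of the target region template[start:end].

    The forward primer copies the sense strand immediately upstream of the
    target; the reverse primer is the reverse complement of the sense strand
    immediately downstream, so it anneals to the sense strand and primes
    synthesis back across the target.
    """
    if start - length < 0 or end + length > len(template):
        raise ValueError("template does not extend far enough past the target")
    forward = template[start - length:start]
    reverse = reverse_complement(template[end:end + length])
    return forward, reverse


if __name__ == "__main__":
    template = "TTGACC" + "ATGGCC" + "GGATAC"  # flank + target + flank
    print(flanking_primers(template, 6, 12, length=6))
```

Both primers are written 5' to 3', which is the usual convention for ordering oligonucleotides.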

The nucleotide sequence of interest can also be
produced synthetically, rather than cloned, using a DNA
synthesizer (e.g., an Applied Biosystems Model 392 DNA
Synthesizer, available from ABI, Foster City, California).
The nucleotide sequence can be designed with the appropriate
codons for the expression product desired. The complete
coding sequence is assembled from overlapping
oligonucleotides prepared by standard methods. See, e.g.,
Edge (1981) Nature 292:756; Nambiar et al. (1984) Science
223:1299; Jay et al. (1984) J. Biol. Chem. 259:6311.
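
By way of illustration only, the division of a designed coding sequence into overlapping oligonucleotides, and its reassembly, might be sketched as follows. The function names, oligonucleotide length and overlap length are hypothetical choices; the sketch tiles a single strand, whereas laboratory assembly schemes typically alternate strands and tune each overlap for annealing.

```python
# Illustrative sketch only: names and lengths are assumptions.
def split_into_oligos(seq: str, oligo_len: int = 40, overlap: int = 15):
    """Tile seq into fragments of at most oligo_len that share `overlap` bases."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos = []
    pos = 0
    while True:
        oligos.append(seq[pos:pos + oligo_len])
        if pos + oligo_len >= len(seq):
            break
        pos += step
    return oligos


def assemble(oligos, overlap: int = 15) -> str:
    """Rebuild the full sequence, checking each junction's shared bases."""
    assembled = oligos[0]
    for oligo in oligos[1:]:
        k = min(overlap, len(oligo))
        if assembled[-k:] != oligo[:k]:
            raise ValueError("adjacent oligonucleotides do not overlap as expected")
        assembled += oligo[k:]
    return assembled


if __name__ == "__main__":
    design = "ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAATTAGATCGATGGGAAAAAATTCGG"
    parts = split_into_oligos(design, oligo_len=20, overlap=8)
    assert assemble(parts, overlap=8) == design
    print(len(parts), "oligonucleotides")
```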

The synthetic expression cassettes of the present 
invention can be employed in the construction of packaging 
cell lines for use with retroviral vectors. 

One type of retrovirus, the murine leukemia virus, or
"MLV", has been widely utilized for gene therapy
applications (see generally Mann et al. (Cell 33:153, 1983),
Cone and Mulligan (Proc. Nat'l. Acad. Sci. USA 81:6349,
1984), and Miller et al., Human Gene Therapy 1:5-14, 1990).

Lentiviral vectors typically comprise a 5' lentiviral
LTR, a tRNA binding site, a packaging signal, a promoter
operably linked to one or more genes of interest, an origin
of second strand DNA synthesis and a 3' lentiviral LTR,
wherein the lentiviral vector contains a nuclear transport
element. The nuclear transport element may be located
either upstream (5') or downstream (3') of a coding sequence
of interest. Within certain embodiments, the nuclear 
transport element is not RRE. Within one embodiment the 
packaging signal is an extended packaging signal. Within 
other embodiments the promoter is a tissue specific 
promoter, or, alternatively, a promoter such as CMV. Within 
other embodiments, the lentiviral vector further comprises 
an internal ribosome entry site. 
A wide variety of lentiviruses may be utilized within
the context of the present invention, including for example
lentiviruses selected from the group consisting of HIV,
HIV-1, HIV-2, FIV and SIV.

In one embodiment of the present invention synthetic
Env and/or Gag-polymerase expression cassettes are provided
comprising a promoter and a sequence encoding synthetic
Gag-polymerase (SEQ ID NO: 6) and at least one of vpr, vpu,
nef or vif, wherein the promoter is operably linked to
Gag-polymerase and vpr, vpu, nef or vif.

Within yet another aspect of the invention, host cells 
(e.g., packaging cell lines) are provided which contain any 
of the expression cassettes described herein. For example, 
within one aspect packaging cell lines are provided
comprising an expression cassette that comprises a sequence
encoding synthetic Env and/or Gag-polymerase, and a nuclear
transport element, wherein the promoter is operably linked
to the sequence encoding Env and/or Gag-polymerase.
Packaging cell lines may further comprise a promoter and a
sequence encoding tat, rev, or an envelope, wherein the
promoter is operably linked to the sequence encoding tat,
rev, or the envelope. The packaging cell line may further
comprise a sequence encoding any one or more of nef, vif, 
vpu or vpr. 

In one embodiment, the expression cassette (carrying, 
for example, the synthetic Env, synthetic tat and/or 
synthetic Gag-polymerase) is stably integrated. The 
packaging cell line, upon introduction of a lentiviral 
vector, typically produces viral particles. The promoter 
regulating expression of the synthetic expression cassette 
may be inducible. Typically, the packaging cell line, upon 
introduction of a lentiviral vector, produces viral
particles that are essentially free of replication competent
virus.

Packaging cell lines are provided comprising an 
expression cassette which directs the expression of a 
synthetic Env (or Gag-polymerase) gene, and an expression
cassette which directs the expression of a Gag (or Env) gene
optimized for expression (e.g., Andre, S., et al., Journal
of Virology 72(2):1497-1503, 1998; Haas, J., et al., Current
Biology 6(3):315-324, 1996). A lentiviral vector is
introduced into the packaging cell line to produce a vector 
particle producing cell line. 

As noted above, lentiviral vectors can be designed to 
carry or express a selected gene(s) or sequences of 
interest. Lentiviral vectors may be readily constructed 
from a wide variety of lentiviruses (see RNA Tumor Viruses, 
Second Edition, Cold Spring Harbor Laboratory, 1985) . 
Representative examples of lentiviruses include HIV, HIV-1,
HIV-2, FIV and SIV. Such lentiviruses may either be
obtained from patient isolates, or, more preferably, from 
depositories or collections such as the American Type 
Culture Collection, or isolated from known sources using 
available techniques. 

Portions of the lentiviral gene delivery vectors (or 
vehicles) may be derived from different viruses. For 
example, in a given recombinant lentiviral vector, LTRs may 
be derived from an HIV, a packaging signal from SIV, and an 
origin of second strand synthesis from HIV-2. Lentiviral
vector constructs may comprise a 5' lentiviral LTR, a tRNA
binding site, a packaging signal, one or more heterologous
sequences, an origin of second strand DNA synthesis and a 3'
LTR, wherein said lentiviral vector contains a nuclear
transport element that is not RRE.

Briefly, Long Terminal Repeats ("LTRs") are subdivided
into three elements, designated U5, R and U3. These
elements contain a variety of signals which are responsible
for the biological activity of a retrovirus, including for
example, promoter and enhancer elements which are located
within U3. LTRs may be readily identified in the provirus
(integrated DNA form) due to their precise duplication at
either end of the genome. As utilized herein, a 5' LTR
should be understood to include a 5' promoter element and
sufficient LTR sequence to allow reverse transcription and
integration of the DNA form of the vector. The 3' LTR
should be understood to include a polyadenylation signal,
and sufficient LTR sequence to allow reverse transcription
and integration of the DNA form of the vector.

The tRNA binding site and origin of second strand DNA 
synthesis are also important for a retrovirus to be 
biologically active, and may be readily identified by one of
skill in the art. For example, retroviral tRNA binds to a
tRNA binding site by Watson-Crick base pairing, and is
carried with the retrovirus genome into a viral particle.
The tRNA is then utilized as a primer for DNA synthesis by
reverse transcriptase. The tRNA binding site may be readily
identified based upon its location just downstream from the
5' LTR. Similarly, the origin of second strand DNA synthesis
is, as its name implies, important for the second strand DNA
synthesis of a retrovirus. This region, which is also
referred to as the poly-purine tract, is located just
upstream of the 3' LTR.

In addition to a 5' and 3' LTR, tRNA binding site, and
origin of second strand DNA synthesis, recombinant
retroviral vector constructs may also comprise a packaging
signal, as well as one or more genes or coding sequences of
interest. In addition, the lentiviral vectors have a
nuclear transport element which, in preferred embodiments, is
not RRE. Representative examples of suitable nuclear
transport elements include the element in Rous sarcoma virus
(Ogert, et al., J. Virol. 70, 3834-3843, 1996), the element in
Rous sarcoma virus (Liu & Mertz, Genes & Dev., 9, 1766-1789,
1995) and the element in the genome of simian retrovirus
type I (Zolotukhin, et al., J. Virol. 68, 7944-7952, 1994).
Other potential elements include the elements in the histone
gene (Kedes, Annu. Rev. Biochem. 48, 837-870, 1979), the
α-interferon gene (Nagata et al., Nature 287, 401-408, 1980),
the β-adrenergic receptor gene (Kobilka, et al., Nature 329,
75-79, 1987), and the c-Jun gene (Hattori, et al., Proc.
Natl. Acad. Sci. USA 85, 9148-9152, 1988).

Recombinant lentiviral vector constructs typically lack
both Gag-polymerase and env coding sequences. Recombinant
lentiviral vectors typically contain less than 20, preferably
15, more preferably 10, and most preferably 8 consecutive
nucleotides found in Gag-polymerase or env genes. One
advantage of the present invention is that the synthetic
Gag-polymerase expression cassettes, which can be used to
construct packaging cell lines for the recombinant
retroviral vector constructs, have little homology to
wild-type Gag-polymerase sequences and thus considerably
reduce or eliminate the possibility of homologous
recombination between the synthetic and wild-type sequences.
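
The limit on consecutive shared nucleotides can be checked computationally. The following is a minimal sketch, assuming the two sequences are supplied as plain strings; the function names are hypothetical, and only the same strand is compared (a fuller check would also compare against the reverse complement).

```python
# Illustrative sketch only: names and the same-strand comparison are assumptions.
def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest stretch of consecutive nucleotides found in both
    sequences (longest common substring, by dynamic programming)."""
    a, b = a.upper(), b.upper()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous base of each sequence.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def below_homology_limit(vector_seq: str, gene_seq: str, limit: int = 20) -> bool:
    """True if the vector shares fewer than `limit` consecutive nucleotides
    with the gene sequence."""
    return longest_shared_run(vector_seq, gene_seq) < limit


if __name__ == "__main__":
    print(longest_shared_run("AAACGTACGTTT", "GGCGTACGCC"))
```

The `limit` argument corresponds to the thresholds recited above (20, 15, 10 or 8). The quadratic running time is adequate for gene-sized inputs; longer sequences would call for an indexed approach.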

Lentiviral vectors may also include tissue-specific
promoters to drive expression of one or more genes or
sequences of interest. For example, lentiviral vector
particles of the invention can contain a liver specific
promoter to maximize the potential for liver specific
expression of the exogenous DNA sequence contained in the
vectors. Preferred liver specific promoters include the
hepatitis B X-gene promoter and the hepatitis B core protein
promoter. These liver specific promoters are preferably
employed with their respective enhancers. The enhancer
element can be linked at either the 5' or the 3' end of the
nucleic acid encoding the sequences of interest. The
hepatitis B X gene promoter and its enhancer can be obtained
from the viral genome as a 332 base pair EcoRV-NcoI DNA
fragment employing the methods described in Twu, et al., J.
Virol. 61:3448-3453, 1987. The hepatitis B core protein
promoter can be obtained from the viral genome as a 584 base
pair BamHI-BglII DNA fragment employing the methods
described in Gerlach, et al., Virol. 189:59-66, 1992. It may
be necessary to remove the negative regulatory sequence in
the BamHI-BglII fragment prior to inserting it. Other liver
specific promoters include the AFP (alpha fetal protein)
gene promoter and the albumin gene promoter, as disclosed in
EP Patent Publication 0 415 731, the α-1 antitrypsin gene
promoter, as disclosed in Rettenger, et al., Proc. Natl.
Acad. Sci. 91:1460-1464, 1994, the fibrinogen gene
promoter, the APO-A1 (Apolipoprotein A1) gene promoter, and
the promoters of genes for liver transferase enzymes such as,
for example, SGOT, SGPT and glutamyl transferase. See also
PCT Patent Publications WO 90/07936 and WO 91/02805 for a
description of the use of liver specific promoters in
lentiviral vector particles.

Lentiviral vector constructs may be generated such that
more than one gene of interest is expressed. This may be
accomplished through the use of di- or oligo-cistronic
cassettes (e.g., where the coding regions are separated by
80 nucleotides or less, see generally Levin et al., Gene
108:167-174, 1991), or through the use of Internal Ribosome 
Entry Sites ("IRES"). 

Packaging cell lines suitable for use with the above 
described recombinant retroviral vector constructs may be 
readily prepared given the disclosure provided herein. 
Briefly, the parent cell line from which the packaging cell 
line is derived can be selected from a variety of mammalian 
cell lines, including for example, 293, RD, COS-7, CHO, BHK, 
VERO, HT1080, and myeloma cells. 

After selection of a suitable host cell for the 
generation of a packaging cell line, one or more expression 
cassettes are introduced into the cell line in order to 
complement or supply in trans components of the vector which 
have been deleted. 

Representative examples of suitable expression 
cassettes have been described herein and include synthetic 
Env, tat, Gag, synthetic Gag-protease, synthetic Gag-reverse 
transcriptase and synthetic Gag-polymerase expression 
cassettes, which comprise a promoter and a sequence 
encoding, e.g., Env, tat, or Gag-polymerase and at least one 
of vpr, vpu, nef or vif, wherein the promoter is operably
linked to Env, tat or Gag-polymerase and vpr, vpu, nef or 
vif. As described above, optimized Env, Gag and/or tat 
coding sequences may also be utilized in various 
combinations in the generation of packaging cell lines. 

Utilizing the above-described expression cassettes, a 
wide variety of packaging cell lines can be generated. For 
example, within one aspect packaging cell lines are provided
comprising an expression cassette that comprises a sequence
encoding synthetic HIV (e.g., Gag, Env, tat, Gag-polymerase,
Gag-reverse transcriptase or Gag-protease) polypeptide, and
a nuclear transport element, wherein the promoter is
operably linked to the sequence encoding the HIV
polypeptide. Within other aspects, packaging cell lines are
provided comprising a promoter and a sequence encoding Gag,
tat, rev, or an envelope (e.g., HIV env), wherein the
promoter is operably linked to the sequence encoding Gag,
tat, rev, or the envelope. Within further embodiments, the
packaging cell line may comprise a sequence encoding any one
or more of nef, vif, vpu or vpr. For example, the packaging
cell line may contain only nef, vif, vpu, or vpr alone, nef
and vif, nef and vpu, nef and vpr, vif and vpu, vif and vpr,
vpu and vpr, nef vif and vpu, nef vif and vpr, nef vpu and
vpr, vif vpu and vpr, or all four of nef vif vpu and vpr.
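
The combinations recited above are the fifteen non-empty subsets of the four accessory genes. As an illustrative sketch (the function name is hypothetical), they can be enumerated as follows:

```python
# Illustrative sketch only: enumerates the non-empty subsets of the four
# accessory genes named in the text.
from itertools import combinations

ACCESSORY_GENES = ("nef", "vif", "vpu", "vpr")


def accessory_gene_sets(genes=ACCESSORY_GENES):
    """Return every non-empty combination of the given genes, smallest first."""
    return [
        combo
        for size in range(1, len(genes) + 1)
        for combo in combinations(genes, size)
    ]


if __name__ == "__main__":
    for combo in accessory_gene_sets():
        print(" + ".join(combo))
```

For four genes this yields 4 single genes, 6 pairs, 4 triples and 1 set of all four.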

In one embodiment, the expression cassette is stably 
integrated. Within another embodiment, the packaging cell 
line, upon introduction of a lentiviral vector, produces 
particles. Within further embodiments the promoter is 
inducible. Within certain preferred embodiments of the 
invention, the packaging cell line, upon introduction of a 
lentiviral vector, produces particles that are free of 
replication competent virus. 

The synthetic cassettes containing optimized coding 
sequences are transfected into a selected cell line. 
Transfected cells are selected that (i) carry, typically, 
integrated, stable copies of the Gag, Pol, and Env coding 
sequences, and (ii) are expressing acceptable levels of 
these polypeptides (expression can be evaluated by methods 
known in the prior art, e.g., see Examples 1-4). The 
ability of the cell line to produce VLPs may also be
verified (Examples 6, 7 and 15).

A sequence of interest is constructed into a suitable 
viral vector as discussed above. This defective virus is 
then transfected into the packaging cell line. The 
packaging cell line provides the viral functions necessary 
for producing virus-like particles into which the defective
viral genome, containing the sequence of interest, is
packaged. These VLPs are then isolated and can be used, for
example, in gene delivery or gene therapy. 

Further, such packaging cell lines can also be used to 
produce VLPs alone, which can, for example, be used as 
adjuvants for administration with other antigens or in 
vaccine compositions. Also, co-expression of a selected 
sequence of interest encoding a polypeptide (for example, an 
antigen) in the packaging cell line can also result in the 
entrapment and/or association of the selected polypeptide 
in/with the VLPs. 

2.3 DNA Immunization and Gene Delivery

A variety of polypeptide antigens can be used in the 
practice of the present invention. Polypeptide antigens can 
be included in DNA immunization constructs containing, for 
example, any of the synthetic expression cassettes described 
herein fused in-frame to a coding sequence for the
polypeptide antigen, where expression of the construct
results in VLPs presenting the antigen of interest.
Antigens can be derived from a wide variety of viruses,
bacteria, fungi, plants, protozoans and other parasites.
For example, the present invention will find use for
stimulating an immune response against a wide variety of
proteins from the herpesvirus family, including proteins
derived from herpes simplex virus (HSV) types 1 and 2, such
as HSV-1 and HSV-2 gB, gD, gH, VP16 and VP22; antigens
derived from varicella zoster virus (VZV), Epstein-Barr
virus (EBV) and cytomegalovirus (CMV) including CMV gB and
gH; and antigens derived from other human herpesviruses such
as HHV6 and HHV7. (See, e.g., Chee et al., Cytomegaloviruses
(J.K. McDougall, ed., Springer-Verlag 1990) pp. 125-169, for
a review of the protein coding content of cytomegalovirus;
McGeoch et al., J. Gen. Virol. (1988) 69:1531-1574, for a
discussion of the various HSV-1 encoded proteins; U.S.
Patent No. 5,171,568 for a discussion of HSV-1 and HSV-2 gB
and gD proteins and the genes encoding therefor; Baer et
al., Nature (1984) 310:207-211, for the identification of
protein coding sequences in an EBV genome; and Davison and
Scott, J. Gen. Virol. (1986) 67:1759-1816, for a review of
VZV.)

Additionally, immune responses to antigens from the
hepatitis family of viruses, including hepatitis A virus
(HAV), hepatitis B virus (HBV), hepatitis C virus (HCV), the
delta hepatitis virus (HDV), hepatitis E virus (HEV), and
hepatitis G virus, can also be stimulated using the
constructs of the present invention. By way of example, the
HCV genome encodes several viral proteins, including E1
(also known as E) and E2 (also known as E2/NS1), which will
find use with the present invention (see, Houghton et al.
Hepatology (1991) 14:381-388, for a discussion of HCV
proteins, including E1 and E2). The δ-antigen from HDV can
also be used (see, e.g., U.S. Patent No. 5,389,528, for a
description of the δ-antigen).
Similarly, influenza virus is another example of a
virus for which the present invention will be particularly
useful. Specifically, the envelope glycoproteins HA and NA
of influenza A are of particular interest for generating an
immune response. Numerous HA subtypes of influenza A have
been identified (Kawaoka et al., Virology (1990)
179:759-767; Webster et al. "Antigenic variation among type
A influenza viruses," p. 127-168. In: P. Palese and D.W.
Kingsbury (ed.), Genetics of influenza viruses.
Springer-Verlag, New York).

Other antigens of particular interest to be used in the
practice of the present invention include antigens and
polypeptides derived therefrom from human papillomavirus
(HPV), such as one or more of the various early proteins
including E6 and E7; tick-borne encephalitis viruses; and
HIV-1 (also known as HTLV-III, LAV, ARV, etc.), including,
but not limited to, antigens such as gp120, gp41, gp160, Gag
and pol from a variety of isolates including, but not
limited to, HIV IIIb, HIV SF2, HIV-1 SF162, HIV-1 SF170,
HIV LAV, HIV LAI, HIV MN, HIV-1 CM235, HIV-1 US4, other
HIV-1 strains from diverse subtypes (e.g., subtypes A
through G, and O), HIV-2 strains and diverse subtypes (e.g.,
HIV-2 UC1 and HIV-2 UC2). See, e.g., Myers, et al., Los
Alamos Database, Los Alamos National Laboratory, Los Alamos,
New Mexico; Myers, et al., Human Retroviruses and Aids,
1990, Los Alamos, New Mexico: Los Alamos National
Laboratory.

Proteins derived from other viruses will also find use 
in the claimed methods, such as without limitation, proteins 
from members of the families Picornaviridae (e.g., 
polioviruses, etc.); Caliciviridae; Togaviridae (e.g.,
rubella virus, dengue virus, etc.); Flaviviridae;
Coronaviridae; Reoviridae; Birnaviridae; Rhabdoviridae
(e.g., rabies virus, etc.); Filoviridae; Paramyxoviridae
(e.g., mumps virus, measles virus, respiratory syncytial
virus, etc.); Orthomyxoviridae (e.g., influenza virus types
A, B and C, etc.); Bunyaviridae; Arenaviridae; Retroviridae,
e.g., HTLV-I; HTLV-II; HIV-1; HIV-2; simian immunodeficiency
virus (SIV) among others. See, e.g., Virology, 3rd Edition
(W.K. Joklik ed. 1988); Fundamental Virology, 2nd Edition
(B.N. Fields and D.M. Knipe, eds. 1991); Virology, 3rd
Edition (Fields, BN, DM Knipe, PM Howley, Editors, 1996,
Lippincott-Raven, Philadelphia, PA) for a description of
these and other viruses.
,5 Particularly preferred bacterial antigens are derived 

,J from organisms that cause diphtheria, tetanus, pertussis, 

If- 15 meningitis, and other pathogenic states, including, without 
gfs limitation, antigens derived from Corynebacterium 

J diphtheriae, Clostridium tetani, Bordetella pertusis, 

ffi Neisseria meningitidis, including serotypes Meningococcus A, 

J B, C, Y and WI35 (MenA, B, C, Y and WI35) , Haemophilus 

20 influenza type B (Hib) , and Helicobacter pylori. Examples 
^ of parasitic antigens include those derived from organisms 

causing malaria, tuberculosis, and Lyme disease. 

Furthermore, the methods described herein provide means 
for treating a variety of malignant cancers. For example, 
the system of the present invention can be used to enhance 
both humoral and cell-mediated immune responses to 
particular proteins specific to a cancer in question, such 
as an activated oncogene, a fetal antigen, or an activation 
marker. Such tumor antigens include any of the various 
MAGEs (melanoma associated antigen E), including MAGE 1, 2, 
3, 4, etc. (Boon, T. Scientific American (March 1993):82-89); 
any of the various tyrosinases; MART 1 (melanoma 
antigen recognized by T cells); mutant ras; mutant p53; p97 
melanoma antigen; CEA (carcinoembryonic antigen), among 
others. 

DNA immunization using synthetic expression cassettes 
of the present invention has been demonstrated to be 
efficacious (Examples 8 and 10-12). Animals were immunized 
with both the synthetic expression cassette and the wild 
type expression cassette. The results of the immunizations 
with plasmid-DNAs showed that the synthetic expression 
cassettes provide a clear improvement of immunogenicity 
relative to the native expression cassettes. Also, the 
second boost immunization induced a secondary immune 
response, for example after two to eight weeks. Further, 
the results of CTL assays showed increased potency of 
synthetic expression cassettes for induction of cytotoxic T- 
lymphocyte (CTL) responses by DNA immunization. 

It is readily apparent that the subject invention can 
be used to mount an immune response to a wide variety of 
antigens and hence to treat or prevent a large number of 
diseases. 

2.3.1 Delivery of the Synthetic Expression Cassettes of the Present Invention 

Polynucleotide sequences coding for the above-described 
molecules can be obtained using recombinant methods, such as 
by screening cDNA and genomic libraries from cells 
expressing the gene, or by deriving the gene from a vector 
known to include the same. The sequences can be analyzed by 
conventional sequencing techniques. Furthermore, the 
desired gene can be isolated directly from cells and tissues 
containing the same, using standard techniques, such as 
phenol extraction and PCR of cDNA or genomic DNA. See, 
e.g., Sambrook et al., supra, for a description of 
techniques used to obtain, isolate and sequence DNA. Once 
the sequence is known, the gene of interest can also be 
produced synthetically, rather than cloned. The nucleotide 
sequence can be designed with the appropriate codons for the 
particular amino acid sequence desired. In general, one 
will select preferred codons for the intended host in which 
the sequence will be expressed. The complete coding 
sequence is assembled from overlapping oligonucleotides 
prepared by standard methods. See, e.g., Edge, Nature 
(1981) 292:756; Nambiar et al., Science (1984) 223:1299; 
Jay et al., J. Biol. Chem. (1984) 259:6311; Stemmer, 
W.P.C., (1995) Gene 164:49-53. 
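
The design step described above (choosing one preferred codon per amino acid, then dividing the designed sequence into overlapping oligonucleotides) can be sketched as follows. This is a minimal illustration only: the codon table is a hypothetical subset chosen for the example and is not taken from the specification, and the oligonucleotides are all drawn from the top strand, whereas practical gene assembly would typically alternate strands. The function names and the default length and overlap values are likewise assumptions.

```python
# Hypothetical preferred-codon table (illustrative subset only; a real design
# would use a full table for the intended host).
PREFERRED_CODON = {
    "M": "ATG", "G": "GGC", "A": "GCC", "R": "CGC", "S": "AGC",
    "V": "GTG", "L": "CTG", "K": "AAG", "E": "GAG", "*": "TGA",
}


def back_translate(protein):
    """Return a coding sequence using one preferred codon per residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


def overlapping_oligos(seq, length=20, overlap=8):
    """Split seq into top-strand fragments; neighbours share `overlap` bases."""
    if not 0 < overlap < length:
        raise ValueError("overlap must be positive and shorter than length")
    step = length - overlap
    oligos = []
    start = 0
    while True:
        oligos.append(seq[start:start + length])
        if start + length >= len(seq):
            break
        start += step
    return oligos


def reassemble(oligos, overlap=8):
    """Rebuild the full sequence by joining fragments at their shared overlap."""
    assembled = oligos[0]
    for oligo in oligos[1:]:
        assembled += oligo[overlap:]
    return assembled


if __name__ == "__main__":
    cds = back_translate("MGARSVLKE*")
    for oligo in overlapping_oligos(cds):
        print(oligo)
```

The reassembly helper is included only to confirm that the fragments tile the designed sequence without gaps; it does not model annealing or polymerase extension.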

Next, the gene sequence encoding the desired antigen 
can be inserted into a vector containing a synthetic 
expression cassette of the present invention (e.g., see 
Example 1 for construction of various exemplary synthetic 
expression cassettes). The antigen is inserted into the 
synthetic coding sequence such that when the combined 
sequence is expressed it results in the production of VLPs 
comprising the polypeptide and/or the antigen of interest. 
Insertions can be made within the Gag coding sequence or at 
either end of the coding sequence (5', amino terminus of the 
expressed polypeptide; or 3', carboxy terminus of the 
expressed polypeptide; e.g., see Example 1) (Wagner, R., et 
al., Arch. Virol. 127:117-137, 1992; Wagner, R., et al., 
Virology 200:162-175, 1994; Wu, X., et al., J. Virol. 
69(6):3389-3398, 1995; Wang, C-T., et al., Virology 
200:524-534, 1994; Chazal, N., et al., J. Virol. 
68(1):111-122, 1994; Griffiths, J.C., et al., J. Virol. 
67(6):3191-3198, 1993; Reicin, A.S., et al., J. Virol. 
69(2):642-650, 1995). 

Up to 50% of the coding sequences of p55Gag can be 
deleted without affecting the assembly to virus-like 
particles and expression efficiency (Borsetti, A., et al., 
J. Virol. 72(11):9313-9317, 1998; Garnier, L., et al., J. 
Virol. 72(6):4667-4677, 1998; Zhang, Y., et al., J. Virol. 
72(3):1782-1789, 1998; Wang, C., et al., J. Virol. 
72(10):7950-7959, 1998). In one embodiment of the present 
invention, immunogenicity of the high level expressing 
synthetic p55GagMod and p55GagProtMod expression cassettes 
can be increased by the insertion of different structural or 
non-structural HIV antigens, multiepitope cassettes, or 
cytokine sequences into deleted, mutated or truncated 
regions of the p55GagMod sequence. In another embodiment of 
the present invention, immunogenicity of the high level 
expressing synthetic Env expression cassettes can be 
increased by the insertion of different structural or 
non-structural HIV antigens, multiepitope cassettes, or 
cytokine sequences into deleted regions of gp120Mod, 
gp140Mod or gp160Mod sequences. Such deletions may be 
generated following the teachings of the present invention 
and information available to one of ordinary skill in the 
art. One possible advantage of this approach, relative to 
using full-length modified Env sequences fused to 
heterologous polypeptides, can be higher 
expression/secretion efficiency and/or higher 
immunogenicity of the expression product. Such deletions 
may be generated following the teachings of the present 
invention and information available to one of ordinary 
skill in the art. One possible advantage of this approach, 
relative to using full-length Env, Gag or Tat sequences 
fused to heterologous polypeptides, can be higher 
expression/secretion efficiency and/or immunogenicity of the 
expression product. 

When sequences are added to the amino terminal end of 
Gag (for example, when using the synthetic p55GagMod 
expression cassette of the present invention), the 
polynucleotide can contain coding sequences at the 5' end 
that encode a signal for addition of a myristic moiety to 
the Gag-containing polypeptide (e.g., sequences that encode 
Met-Gly). 
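
As a simple illustration of the Met-Gly requirement noted above, the following sketch checks whether a coding sequence begins with codons for Met-Gly and, if it does not, prepends them in frame. The function names and the choice of GGC as the glycine codon are assumptions made for the example; the check covers only the Met-Gly dipeptide mentioned in the text and does not attempt to model any further sequence context that may influence myristylation.

```python
# The four standard codons for glycine.
GLYCINE_CODONS = {"GGA", "GGC", "GGG", "GGT"}


def encodes_met_gly_start(cds):
    """True if the coding sequence opens with ATG followed by a Gly codon."""
    cds = cds.upper()
    return len(cds) >= 6 and cds[:3] == "ATG" and cds[3:6] in GLYCINE_CODONS


def with_met_gly_leader(cds, gly_codon="GGC"):
    """Return cds unchanged if it already encodes Met-Gly at its 5' end;
    otherwise prepend ATG plus a glycine codon (hypothetical helper)."""
    if gly_codon not in GLYCINE_CODONS:
        raise ValueError("gly_codon must encode glycine")
    cds = cds.upper()
    if encodes_met_gly_start(cds):
        return cds
    # Six bases are added, so the downstream reading frame is preserved.
    return "ATG" + gly_codon + cds


if __name__ == "__main__":
    print(with_met_gly_leader("GCCAAGGAG"))
```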

The ability of Gag-containing polypeptide constructs to 
form VLPs can be empirically determined following the 
teachings of the present specification. 

HIV polypeptide/antigen synthetic expression cassettes 
include control elements operably linked to the coding 
sequence, which allow for the expression of the gene in vivo 
in the subject species. For example, typical promoters for 
mammalian cell expression include the SV40 early promoter, a 
CMV promoter such as the CMV immediate early promoter, the 
mouse mammary tumor virus LTR promoter, the adenovirus major 
late promoter (Ad MLP), and the herpes simplex virus 
promoter, among others. Other nonviral promoters, such as a 
promoter derived from the murine metallothionein gene, will 
also find use for mammalian expression. Typically, 
transcription termination and polyadenylation sequences will 
also be present, located 3' to the translation stop codon. 
Preferably, a sequence for optimization of initiation of 
translation, located 5' to the coding sequence, is also 
present. Examples of transcription 
terminator/polyadenylation signals include those derived 
from SV40, as described in Sambrook et al., supra, as well 
as a bovine growth hormone terminator sequence. 

Enhancer elements may also be used herein to increase 
expression levels of the mammalian constructs. Examples 
include the SV40 early gene enhancer, as described in 
Dijkema et al., EMBO J. (1985) 4:761, the enhancer/promoter 
derived from the long terminal repeat (LTR) of the Rous 
Sarcoma Virus, as described in Gorman et al., Proc. Natl. 
Acad. Sci. USA (1982b) 79:6777 and elements derived from 
human CMV, as described in Boshart et al., Cell (1985) 
41:521, such as elements included in the CMV intron A 
sequence. 

Furthermore, plasmids can be constructed which include 
chimeric antigen-coding gene sequences, encoding, e.g., 
multiple antigens/epitopes of interest, for example derived 
from a single or from more than one viral isolate. 

Typically the antigen coding sequences precede or 
follow the synthetic coding sequences and the chimeric 
transcription unit will have a single open reading frame 
encoding both the antigen of interest and the synthetic Gag 
coding sequences. Alternatively, multi-cistronic cassettes 
(e.g., bi-cistronic cassettes) can be constructed allowing 
expression of multiple antigens from a single mRNA using the 
EMCV IRES, or the like. Lastly, antigens can be encoded on 
separate transcripts from independent promoters on a single 
plasmid or other vector. 
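
For the first arrangement described above, in which the antigen and the synthetic Gag coding sequences share a single open reading frame, a basic consistency check is that the joined sequence starts with ATG, has a length divisible by three, and contains no stop codon before the final one. The following is a minimal sketch under those assumptions; the function names and the short test sequences are hypothetical and do not correspond to any sequence of the specification.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq):
    """Split a sequence into consecutive triplets, ignoring trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_single_orf(seq):
    """True if seq is one ORF: ATG start, in frame, single terminal stop."""
    seq = seq.upper()
    if len(seq) % 3 != 0 or not seq.startswith("ATG"):
        return False
    triplets = split_codons(seq)
    has_terminal_stop = triplets[-1] in STOP_CODONS
    has_internal_stop = any(c in STOP_CODONS for c in triplets[:-1])
    return has_terminal_stop and not has_internal_stop


def fuse_in_frame(upstream, downstream):
    """Join two coding sequences, dropping the upstream stop codon so that
    translation continues into the downstream sequence."""
    upstream, downstream = upstream.upper(), downstream.upper()
    if len(upstream) % 3 != 0:
        raise ValueError("upstream sequence would shift the reading frame")
    if upstream[-3:] in STOP_CODONS:
        upstream = upstream[:-3]
    return upstream + downstream


if __name__ == "__main__":
    antigen = "ATGGCCAAGTGA"     # hypothetical antigen coding sequence
    gag = "ATGGGCGCCCGCTGA"      # hypothetical Gag fragment
    print(is_single_orf(fuse_in_frame(antigen, gag)))
```

The multi-cistronic and separate-promoter arrangements do not require this check, since each coding sequence is then translated independently.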

Once complete, the constructs are used for nucleic acid 
immunization or the like using standard gene delivery 
protocols. Methods for gene delivery are known in the art. 
See, e.g., U.S. Patent Nos. 5,399,346, 5,580,859, 5,589,466. 
Genes can be delivered either directly to the vertebrate 
subject or, alternatively, delivered ex vivo, to cells 
derived from the subject and the cells reimplanted in the 
subject. 

A number of viral based systems have been developed for 
gene transfer into mammalian cells. For example, 
retroviruses provide a convenient platform for gene delivery 
systems. Selected sequences can be inserted into a vector 
and packaged in retroviral particles using techniques known 
in the art. The recombinant virus can then be isolated and 
delivered to cells of the subject either in vivo or ex vivo. 
A number of retroviral systems have been described (U.S. 
Patent No. 5,219,740; Miller and Rosman, BioTechniques 
(1989) 7:980-990; Miller, A.D., Human Gene Therapy (1990) 
1:5-14; Scarpa et al., Virology (1991) 180:849-852; Burns et 
al., Proc. Natl. Acad. Sci. USA (1993) 90:8033-8037; and 
Boris-Lawrie and Temin, Cur. Opin. Genet. Develop. (1993) 
3:102-109). 

A number of adenovirus vectors have also been 
described. Unlike retroviruses, which integrate into the 
host genome, adenoviruses persist extrachromosomally, thus 
minimizing the risks associated with insertional mutagenesis 
(Haj-Ahmad and Graham, J. Virol. (1986) 57:267-274; Bett et 
al., J. Virol. (1993) 67:5911-5921; Mittereder et al., Human 
Gene Therapy (1994) 5:717-729; Seth et al., J. Virol. (1994) 
68:933-940; Barr et al., Gene Therapy (1994) 1:51-58; 
Berkner, K.L. BioTechniques (1988) 6:616-629; and Rich et 
al., Human Gene Therapy (1993) 4:461-476). 

Additionally, various adeno-associated virus (AAV) 
vector systems have been developed for gene delivery. AAV 
vectors can be readily constructed using techniques well 
known in the art. See, e.g., U.S. Patent Nos. 5,173,414 and 
5,139,941; International Publication Nos. WO 92/01070 
(published 23 January 1992) and WO 93/03769 (published 4 
March 1993); Lebkowski et al., Molec. Cell. Biol. (1988) 
8:3988-3996; Vincent et al., Vaccines 90 (1990) (Cold Spring 
Harbor Laboratory Press); Carter, B.J. Current Opinion in 
Biotechnology (1992) 3:533-539; Muzyczka, N. Current Topics 
in Microbiol. and Immunol. (1992) 158:97-129; Kotin, R.M. 
Human Gene Therapy (1994) 5:793-801; Shelling and Smith, 
Gene Therapy (1994) 1:165-169; and Zhou et al., J. Exp. Med. 
(1994) 179:1867-1875. 

Another vector system useful for delivering the 
polynucleotides of the present invention is the enterically 
administered recombinant poxvirus vaccines described by 
Small, Jr., P.A., et al. (U.S. Patent No. 5,676,950, issued 
October 14, 1997, herein incorporated by reference). 

Additional viral vectors which will find use for 
delivering the nucleic acid molecules encoding the antigens 
of interest include those derived from the pox family of 
viruses, including vaccinia virus and avian poxvirus. By 
way of example, vaccinia virus recombinants expressing the 
genes can be constructed as follows. The DNA encoding the 
particular synthetic Gag/antigen coding sequence is first 
inserted into an appropriate vector so that it is adjacent 
to a vaccinia promoter and flanking vaccinia DNA sequences, 
such as the sequence encoding thymidine kinase (TK). This 
vector is then used to transfect cells which are 
simultaneously infected with vaccinia. Homologous 
recombination serves to insert the vaccinia promoter plus 
the gene encoding the coding sequences of interest into the 
viral genome. The resulting TK⁻ recombinant can be selected 
by culturing the cells in the presence of 
5-bromodeoxyuridine and picking viral plaques resistant 
thereto. 

Alternatively, avipoxviruses, such as the fowlpox and 
canarypox viruses, can also be used to deliver the genes. 
Recombinant avipox viruses, expressing immunogens from 
mammalian pathogens, are known to confer protective immunity 
when administered to non-avian species. The use of an 
avipox vector is particularly desirable in human and other 
mammalian species since members of the avipox genus can only 
productively replicate in susceptible avian species and 
therefore are not infective in mammalian cells. Methods for 
producing recombinant avipoxviruses are known in the art and 
employ genetic recombination, as described above with 
respect to the production of vaccinia viruses. See, e.g., 
WO 91/12882; WO 89/03429; and WO 92/03545. 

Molecular conjugate vectors, such as the adenovirus 
chimeric vectors described in Michael et al., J. Biol. Chem. 
(1993) 268:6866-6869 and Wagner et al., Proc. Natl. Acad. 
Sci. USA (1992) 89:6099-6103, can also be used for gene 
delivery. 

Members of the Alphavirus genus, such as, but not 
limited to, vectors derived from the Sindbis, Semliki 
Forest, and Venezuelan Equine Encephalitis viruses, will 
also find use as viral vectors for delivering the 
polynucleotides of the present invention (for example, a 
synthetic Gag- or Env-polypeptide encoding expression 
cassette as described in Example 14 below). For a 
description of Sindbis-virus derived vectors useful for the 
practice of the instant methods, see, Dubensky et al., J. 
Virol. (1996) 70:508-519; and International Publication Nos. 
WO 95/07995 and WO 96/17072; as well as, Dubensky, Jr., 
T.W., et al., U.S. Patent No. 5,843,723, issued December 1, 
1998, and Dubensky, Jr., T.W., U.S. Patent No. 5,789,245, 
issued August 4, 1998, both herein incorporated by 
reference. 

A vaccinia based infection/transfection system can be 
conveniently used to provide for inducible, transient 
expression of the coding sequences of interest (for example, 
a synthetic Gag/HCV-core expression cassette) in a host 
cell. In this system, cells are first infected in vitro 
with a vaccinia virus recombinant that encodes the 
bacteriophage T7 RNA polymerase. This polymerase displays 
exquisite specificity in that it only transcribes templates 
bearing T7 promoters. Following infection, cells are 
transfected with the polynucleotide of interest, driven by a 
T7 promoter. The polymerase expressed in the cytoplasm from 
the vaccinia virus recombinant transcribes the transfected 
DNA into RNA which is then translated into protein by the 
host translational machinery. The method provides for high 
level, transient, cytoplasmic production of large quantities 
of RNA and its translation products. See, e.g., Elroy-Stein 
and Moss, Proc. Natl. Acad. Sci. USA (1990) 87:6743-6747; 
Fuerst et al., Proc. Natl. Acad. Sci. USA (1986) 
83:8122-8126. 

As an alternative approach to infection with vaccinia 
or avipox virus recombinants, or to the delivery of genes 
using other viral vectors, an amplification system can be 
used that will lead to high level expression following 
introduction into host cells. Specifically, a T7 RNA 
polymerase promoter preceding the coding region for T7 RNA 
polymerase can be engineered. Translation of RNA derived 
from this template will generate T7 RNA polymerase which in 
turn will transcribe more template. Concomitantly, there 
will be a cDNA whose expression is under the control of the 
T7 promoter. Thus, some of the T7 RNA polymerase generated 
from translation of the amplification template RNA will lead 
to transcription of the desired gene. Because some T7 RNA 
polymerase is required to initiate the amplification, T7 RNA 
polymerase can be introduced into cells along with the 
template(s) to prime the transcription reaction. The 
polymerase can be introduced as a protein or on a plasmid 
encoding the RNA polymerase. For a further discussion of T7 
systems and their use for transforming cells, see, e.g., 
International Publication No. WO 94/26911; Studier and 
Moffatt, J. Mol. Biol. (1986) 189:113-130; Deng and Wolff, 
Gene (1994) 143:245-249; Gao et al., Biochem. Biophys. Res. 
Commun. (1994) 200:1201-1206; Gao and Huang, Nuc. Acids Res. 
(1993) 21:2867-2872; Chen et al., Nuc. Acids Res. (1994) 
22:2114-2120; and U.S. Patent No. 5,135,855. 

The synthetic expression cassette of interest can also 
be delivered without a viral vector. For example, the 
synthetic expression cassette can be packaged as DNA or RNA 
in liposomes prior to delivery to the subject or to cells 
derived therefrom. Lipid encapsulation is generally 
accomplished using liposomes which are able to stably bind 
or entrap and retain nucleic acid. The ratio of condensed 
DNA to lipid preparation can vary but will generally be 
around 1:1 (mg DNA:micromoles lipid), or more of lipid. For 
a review of the use of liposomes as carriers for delivery of 
nucleic acids, see, Hug and Sleight, Biochim. Biophys. Acta 
(1991) 1097:1-17; Straubinger et al., in Methods in 
Enzymology (1983), Vol. 101, pp. 512-527. 
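
The 1:1 (mg DNA:micromoles lipid) ratio stated above can be turned into a short worked calculation. The sketch below is illustrative only: the function names are hypothetical, and the molar mass of 700 g/mol used in the example is an assumed round figure rather than a value from the specification; the actual molar mass of the chosen lipid should be substituted.

```python
def lipid_micromoles(dna_mg, micromoles_per_mg_dna=1.0):
    """Micromoles of lipid for a given DNA mass at the stated ratio.

    The text gives about 1 micromole of lipid per mg of DNA as a typical
    value, "or more of lipid", so ratios below 1.0 are rejected here.
    """
    if dna_mg < 0:
        raise ValueError("DNA mass must be non-negative")
    if micromoles_per_mg_dna < 1.0:
        raise ValueError("ratio is expected to be 1.0 or more of lipid")
    return dna_mg * micromoles_per_mg_dna


def lipid_mass_mg(micromoles, molar_mass_g_per_mol):
    """Convert micromoles of lipid to milligrams.

    micromoles x (g/mol) gives micrograms; dividing by 1000 gives mg.
    """
    return micromoles * molar_mass_g_per_mol / 1000.0


if __name__ == "__main__":
    dna_mg = 2.0
    assumed_molar_mass = 700.0  # g/mol, assumed for illustration only
    umol = lipid_micromoles(dna_mg)
    print(umol, "micromoles lipid")
    print(lipid_mass_mg(umol, assumed_molar_mass), "mg lipid")
```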

Liposomal preparations for use in the present invention 
include cationic (positively charged), anionic (negatively 
charged) and neutral preparations, with cationic liposomes 
particularly preferred. Cationic liposomes have been shown 
to mediate intracellular delivery of plasmid DNA (Felgner et 
al., Proc. Natl. Acad. Sci. USA (1987) 84:7413-7416); mRNA 
(Malone et al., Proc. Natl. Acad. Sci. USA (1989) 
86:6077-6081); and purified transcription factors (Debs et 
al., J. Biol. Chem. (1990) 265:10189-10192), in functional 
form. 

Cationic liposomes are readily available. For example, 
N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium (DOTMA) 
liposomes are available under the trademark Lipofectin, from 
GIBCO BRL, Grand Island, NY. (See, also, Felgner et al., 
Proc. Natl. Acad. Sci. USA (1987) 84:7413-7416). Other 
commercially available lipids include DDAB/DOPE and 
DOTAP/DOPE (Boehringer). Other cationic liposomes can be 
prepared from readily available materials using techniques 
well known in the art. See, e.g., Szoka et al., Proc. Natl. 
Acad. Sci. USA (1978) 75:4194-4198; PCT Publication No. WO 
90/11092 for a description of the synthesis of DOTAP 
(1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes. 

Similarly, anionic and neutral liposomes are readily 
available, such as, from Avanti Polar Lipids (Birmingham, 
AL), or can be easily prepared using readily available 
materials. Such materials include phosphatidyl choline, 
cholesterol, phosphatidyl ethanolamine, dioleoylphosphatidyl 
choline (DOPC), dioleoylphosphatidyl glycerol (DOPG), 
dioleoylphosphatidyl ethanolamine (DOPE), among others. 
These materials can also be mixed with the DOTMA and DOTAP 
starting materials in appropriate ratios. Methods for 
making liposomes using these materials are well known in the 
art. 

The liposomes can comprise multilamellar vesicles 
(MLVs), small unilamellar vesicles (SUVs), or large 
unilamellar vesicles (LUVs). The various liposome-nucleic 
acid complexes are prepared using methods known in the art. 
See, e.g., Straubinger et al., in Methods in Enzymology 
(1983), Vol. 101, pp. 512-527; Szoka et al., Proc. Natl. 
Acad. Sci. USA (1978) 75:4194-4198; Papahadjopoulos et al., 
Biochim. Biophys. Acta (1975) 394:483; Wilson et al., Cell 
(1979) 17:77; Deamer and Bangham, Biochim. Biophys. Acta 
(1976) 443:629; Ostro et al., Biochem. Biophys. Res. Commun. 
(1977) 76:836; Fraley et al., Proc. Natl. Acad. Sci. USA 
(1979) 76:3348; Enoch and Strittmatter, Proc. Natl. Acad. 
Sci. USA (1979) 76:145; Fraley et al., J. Biol. Chem. 
(1980) 255:10431; Szoka and Papahadjopoulos, Proc. Natl. 
Acad. Sci. USA (1978) 75:145; and Schaefer-Ridder et al., 
Science (1982) 215:166. 

The DNA and/or protein antigen(s) can also be delivered 
in cochleate lipid compositions similar to those described 
by Papahadjopoulos et al., Biochim. Biophys. Acta (1975) 
394:483-491. See, also, U.S. Patent Nos. 4,663,161 and 
4,871,488. 

The synthetic expression cassette of interest (e.g., 
any of the synthetic expression cassettes described in 
Example 1) may also be encapsulated, adsorbed to, or 
associated with, particulate carriers. Such carriers 
present multiple copies of a selected antigen to the immune 
system and promote migration, trapping and retention of 
antigens in local lymph nodes. The particles can be taken 
up by professional antigen presenting cells such as 
macrophages and dendritic cells, and/or can enhance antigen 
presentation through other mechanisms such as stimulation of 
cytokine release. Examples of particulate carriers include 
those derived from polymethyl methacrylate polymers, as well 
as microparticles derived from poly(lactides) and 
poly(lactide-co-glycolides), known as PLG. See, e.g., 
Jeffery et al., Pharm. Res. (1993) 10:362-368; McGee JP, et 
al., J Microencapsul. 14(2):197-210, 1997; O'Hagan DT, et 
al., Vaccine 11(2):149-54, 1993. 

Furthermore, other particulate systems and polymers can 
be used for the in vivo or ex vivo delivery of the gene of 
interest. For example, polymers such as polylysine, 
polyarginine, polyornithine, spermine, spermidine, as well 
as conjugates of these molecules, are useful for 
transferring a nucleic acid of interest. Similarly, DEAE 
dextran-mediated transfection, calcium phosphate 
precipitation or precipitation using other insoluble 
inorganic salts, such as strontium phosphate, aluminum 
silicates including bentonite and kaolin, chromic oxide, 
magnesium silicate, talc, and the like, will find use with 
the present methods. See, e.g., Felgner, P.L., Advanced 
Drug Delivery Reviews (1990) 5:163-187, for a review of 
delivery systems useful for gene transfer. Peptoids 
(Zuckerman, R.N., et al., U.S. Patent No. 5,831,005, issued 
November 3, 1998, herein incorporated by reference) may also 
be used for delivery of a construct of the present 
invention. 

Additionally, biolistic delivery systems employing 
particulate carriers such as gold and tungsten, are 
especially useful for delivering synthetic expression 
cassettes of the present invention. The particles are 
coated with the synthetic expression cassette(s) to be 
delivered and accelerated to high velocity, generally under 
a reduced atmosphere, using a gun powder discharge from a 
"gene gun." For a description of such techniques, and 
apparatuses useful therefor, see, e.g., U.S. Patent Nos. 
4,945,050; 5,036,006; 5,100,792; 5,179,022; 5,371,015; and 
5,478,744. Also, needle-less injection systems can be used 
(Davis, H.L., et al., Vaccine 12:1503-1509, 1994; Bioject, 
Inc., Portland, OR). 

Recombinant vectors carrying a synthetic expression 
cassette of the present invention are formulated into 
compositions for delivery to the vertebrate subject. These 
compositions may either be prophylactic (to prevent 
infection) or therapeutic (to treat disease after 
infection). The compositions will comprise a 
"therapeutically effective amount" of the gene of interest 
such that an amount of the antigen can be produced in vivo 
so that an immune response is generated in the individual to 
which it is administered. The exact amount necessary will 
vary depending on the subject being treated; the age and 
general condition of the subject to be treated; the capacity 
of the subject's immune system to synthesize antibodies; the 
degree of protection desired; the severity of the condition 
being treated; the particular antigen selected and its mode 
of administration, among other factors. An appropriate 
effective amount can be readily determined by one of skill 
in the art. Thus, a "therapeutically effective amount" will 
fall in a relatively broad range that can be determined 
through routine trials. 

The compositions will generally include one or more 
"pharmaceutically acceptable excipients or vehicles" such as 
water, saline, glycerol, polyethyleneglycol, hyaluronic 
acid, ethanol, etc. Additionally, auxiliary substances, 
such as wetting or emulsifying agents, pH buffering 
substances, surfactants and the like, may be present in such 
vehicles. Certain facilitators of immunogenicity or of 
nucleic acid uptake and/or expression can also be included 
in the compositions or coadministered, such as, but not 
limited to, bupivacaine, cardiotoxin and sucrose. 

Once formulated, the compositions of the invention can 
be administered directly to the subject (e.g., as described 
above) or, alternatively, delivered ex vivo, to cells 
derived from the subject, using methods such as those 
described above. For example, methods for the ex vivo 
delivery and reimplantation of transformed cells into a 
subject are known in the art and can include, e.g., 
dextran-mediated transfection, calcium phosphate 
precipitation, polybrene mediated transfection, 
lipofectamine and LT-1 mediated transfection, protoplast 
fusion, electroporation, encapsulation of the 
polynucleotide(s) (with or without the corresponding 
antigen) in liposomes, and direct microinjection of the DNA 
into nuclei. 

Direct delivery of synthetic expression cassette 
compositions in vivo will generally be accomplished with or 
without viral vectors, as described above, by injection 
using either a conventional syringe, needle-less devices such 
as Bioject® or a gene gun, such as the Accell® gene delivery 
system (PowderJect Technologies, Inc., Oxford, England). 
The constructs can be delivered (e.g., injected) either 
subcutaneously, epidermally, intradermally, intramuscularly, 
intravenously, intramucosally (such as nasally, rectally and 
vaginally), intraperitoneally or orally. Delivery of DNA 
into cells of the epidermis is particularly preferred as 
this mode of administration provides access to 
skin-associated lymphoid cells and provides for a transient 
presence of DNA in the recipient. Other modes of 
administration include oral ingestion and pulmonary 
administration, suppositories, needle-less injection, 
transcutaneous and transdermal applications. Dosage 
treatment may be a single dose schedule or a multiple dose 
schedule. 

2.3.2 Ex Vivo Delivery of the Synthetic Expression Cassettes of the Present Invention 

In one embodiment, T cells, and related cell types 
(including but not limited to antigen presenting cells, such 
as macrophages, monocytes, lymphoid cells, dendritic cells, 
B-cells, T-cells, stem cells, and progenitor cells thereof), 
can be used for ex vivo delivery of the synthetic expression 
cassettes of the present invention. T cells can be isolated 
from peripheral blood lymphocytes (PBLs) by a variety of 
procedures known to those skilled in the art. For example, 
T cell populations can be "enriched" from a population of 
PBLs through the removal of accessory and B cells. In 
particular, T cell enrichment can be accomplished by the 
elimination of non-T cells using anti-MHC class II 
monoclonal antibodies. Similarly, other antibodies can be 
used to deplete specific populations of non-T cells. For 
example, anti-Ig antibody molecules can be used to deplete B 
cells and anti-MacI antibody molecules can be used to 
deplete macrophages. 

T cells can be further fractionated into a number of 
different subpopulations by techniques known to those 
skilled in the art. Two major subpopulations can be 
isolated based on their differential expression of the cell 
surface markers CD4 and CD8. For example, following the 
enrichment of T cells as described above, CD4+ cells can be 
enriched using antibodies specific for CD4 (see Coligan et 
al., supra). The antibodies may be coupled to a solid 
support such as magnetic beads. Conversely, CD8+ cells can 
be enriched through the use of antibodies specific for CD4 
(to remove CD4+ cells), or can be isolated by the use of CD8 
antibodies coupled to a solid support. CD4 lymphocytes from 
HIV-1 infected patients can be expanded ex vivo, before or 
after transduction as described by Wilson et al. (1995) J. 
Infect. Dis. 172:88. 

Following purification of T cells, a variety of methods 
of genetic modification known to those skilled in the art 
can be performed using non-viral or viral-based gene 
transfer vectors constructed as described herein. For 
example, one such approach involves transduction of the 
purified T cell population with vector-containing 
supernatant of cultures derived from vector-producing cells. 
A second approach involves co-cultivation of an irradiated 
monolayer of vector-producing cells with the purified T 
cells. A third approach involves a similar co-cultivation 
approach; however, the purified T cells are pre-stimulated 
with various cytokines and cultured 48 hours prior to the 
co-cultivation with the irradiated vector-producing cells. 
Pre-stimulation prior to such transduction increases 
effective gene transfer (Nolta et al. (1992) Exp. Hematol. 
20:1065). Stimulation of these cultures to proliferate also 
provides increased cell populations for re-infusion into the 
patient. Subsequent to co-cultivation, T cells are 
collected from the vector-producing cell monolayer, 
expanded, and frozen in liquid nitrogen. 

Gene transfer vectors containing one or more synthetic 
expression cassettes of the present invention (associated 
with appropriate control elements for delivery to the 
isolated T cells) can be assembled using known methods. 

Selectable markers can also be used in the construction 
of gene transfer vectors. For example, a marker can be used 
which imparts to a mammalian cell transduced with the gene 
transfer vector resistance to a cytotoxic agent. The 
cytotoxic agent can be, but is not limited to, neomycin, 
aminoglycoside, tetracycline, chloramphenicol, sulfonamide, 
actinomycin, netropsin, distamycin A, anthracycline, or 
pyrazinamide. For example, neomycin phosphotransferase II 
imparts resistance to the neomycin analogue geneticin 
(G418). 

The T cells can also be maintained in a medium 
containing at least one type of growth factor prior to being 
selected. A variety of growth factors are known in the art 
which sustain the growth of a particular cell type. 
Examples of such growth factors are cytokine mitogens such 
as rIL-2, IL-10, IL-12, and IL-15, which promote growth and 
activation of lymphocytes. Certain types of cells are 
stimulated by other growth factors such as hormones, 
including human chorionic gonadotropin (hCG) and human 
growth hormone. The selection of an appropriate growth 
factor for a particular cell population is readily 
accomplished by one of skill in the art. 

For example, white blood cells such as differentiated 
progenitor and stem cells are stimulated by a variety of 
growth factors. More particularly, IL-3, IL-4, IL-5, IL-6, 
IL-9, GM-CSF, M-CSF, and G-CSF, produced by activated TH 
cells and activated macrophages, stimulate myeloid stem 
cells, which then differentiate into pluripotent stem cells, 
granulocyte-monocyte progenitors, eosinophil progenitors, 
basophil progenitors, megakaryocytes, and erythroid 
progenitors. Differentiation is modulated by growth factors 
such as GM-CSF, IL-3, IL-6, IL-11, and EPO. 

Pluripotent stem cells then differentiate into lymphoid 
stem cells, bone marrow stromal cells, T cell progenitors, B 
cell progenitors, thymocytes, TH cells, TC cells, and B 
cells. This differentiation is modulated by growth factors 
such as IL-3, IL-4, IL-6, IL-7, GM-CSF, M-CSF, G-CSF, IL-2, 
and IL-5. 

Granulocyte-monocyte progenitors differentiate to 
monocytes, macrophages, and neutrophils. Such 
differentiation is modulated by the growth factors GM-CSF, 
M-CSF, and IL-8. Eosinophil progenitors differentiate into 
eosinophils. This process is modulated by GM-CSF and IL-5. 

The differentiation of basophil progenitors into mast 
cells and basophils is modulated by GM-CSF, IL-4, and IL-9. 
Megakaryocytes produce platelets in response to GM-CSF, EPO, 
and IL-6. Erythroid progenitor cells differentiate into red 
blood cells in response to EPO. 

Thus, during activation by the CD3-binding agent, T 
cells can also be contacted with a mitogen, for example a 
cytokine such as IL-2. In particularly preferred 
embodiments, the IL-2 is added to the population of T cells 
at a concentration of about 50 to 100 µg/ml. Activation 
with the CD3-binding agent can be carried out for 2 to 4 
days. 
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Once suitably activated, the T cells are genetically 
modified by contacting the same with a suitable gene 
transfer vector under conditions that allow for transfection 
of the vectors into the T cells. Genetic modification is 
carried out when the cell density of the T cell population 
is between about 0.1 x 10^6 and 5 x 10^6, preferably between 
about 0.5 x 10^6 and 2 x 10^6. A number of suitable viral and 
nonviral-based gene transfer vectors have been described for 
use herein. 

After transduction, transduced cells are selected away 
from non-transduced cells using known techniques. For 
example, if the gene transfer vector used in the 
transduction includes a selectable marker which confers 
resistance to a cytotoxic agent, the cells can be contacted 
with the appropriate cytotoxic agent, whereby non-transduced 
cells can be negatively selected away from the transduced 
cells. If the selectable marker is a cell surface marker, 
the cells can be contacted with a binding agent specific for 
the particular cell surface marker, whereby the transduced 
cells can be positively selected away from the population. 
The selection step can also entail fluorescence-activated 
cell sorting (FACS) techniques, such as where FACS is used 
to select cells from the population containing a particular 
surface marker, or the selection step can entail the use of 
magnetically responsive particles as retrievable supports 
for target cell capture and/or background removal. 

More particularly, positive selection of the transduced 
cells can be performed using a FACS cell sorter (e.g. a 
FACSVantage™ Cell Sorter, Becton Dickinson Immunocytometry 
Systems, San Jose, CA) to sort and collect transduced cells 
expressing a selectable cell surface marker. Following 
transduction, the cells are stained with fluorescent-labeled 
antibody molecules directed against the particular cell 
surface marker. The amount of bound antibody on each cell 
can be measured by passing droplets containing the cells 
through the cell sorter. By imparting an electromagnetic 
charge to droplets containing the stained cells, the 
transduced cells can be separated from other cells. The 
positively selected cells are then harvested in sterile 
collection vessels. These cell sorting procedures are 
described in detail, for example, in the FACSVantage™ 
Training Manual, with particular reference to sections 3-11 
to 3-28 and 10-1 to 10-17. 

Positive selection of the transduced cells can also be 
performed using magnetic separation of cells based on 
expression of a particular cell surface marker. In such 
separation techniques, cells to be positively selected are 
first contacted with a specific binding agent (e.g., an 
antibody or reagent that interacts specifically with the 
cell surface marker). The cells are then contacted with 
retrievable particles (e.g., magnetically responsive 
particles) which are coupled with a reagent that binds the 
specific binding agent (that has bound to the positive 
cells). The cell-binding agent-particle complex can then be 
physically separated from non-labeled cells, for example 
using a magnetic field. When using magnetically responsive 
particles, the labeled cells can be retained in a container 
using a magnetic field while the negative cells are removed. 
These and similar separation procedures are known to those 
of ordinary skill in the art. 

Expression of the vector in the selected transduced 
cells can be assessed by a number of assays known to those 
skilled in the art. For example, Western blot or Northern 
analysis can be employed depending on the nature of the 
inserted nucleotide sequence of interest. Once expression 
has been established and the transformed T cells have been 
tested for the presence of the selected synthetic expression 
cassette, they are ready for infusion into a patient via the 
peripheral blood stream. 

The invention includes a kit for genetic modification 
of an ex vivo population of primary mammalian cells. The 
kit typically contains a gene transfer vector coding for at 
least one selectable marker and at least one synthetic 
expression cassette contained in one or more containers, 
ancillary reagents or hardware, and instructions for use of 
the kit. 

Experimental 

Below are examples of specific embodiments for carrying 
out the present invention. The examples are offered for 
illustrative purposes only, and are not intended to limit 
the scope of the present invention in any way. 

Efforts have been made to ensure accuracy with respect 
to numbers used (e.g., amounts, temperatures, etc.), but 
some experimental error and deviation should, of course, be 
allowed for. 
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Example 1 

Generation of Synthetic Gag and Env Expression Cassettes 

A. Modification of HIV-1 Gag, Gag-protease, Gag-reverse 
transcriptase and Gag-polymerase Nucleic Acid Coding 
Sequences 

The Gag (SEQ ID NO:1), Gag-protease (SEQ ID NO:2), 
Gag-polymerase (SEQ ID NO:3), and Gag-reverse transcriptase 
(SEQ ID NO:77) coding sequences were selected from the 
HIV-1 SF2 strain (Sanchez-Pescador, R., et al., Science 
227(4686):484-492, 1985; Luciw, P. A., et al., U.S. Patent 
No. 5,156,949, issued October 20, 1992, herein incorporated 
by reference; Luciw, P. A., et al., U.S. Patent No. 
5,688,688, November 18, 1997). These sequences were 
manipulated to maximize expression of their gene products. 

First, the HIV-1 codon usage pattern was modified so 
that the resulting nucleic acid coding sequence was 
comparable to codon usage found in highly expressed human 
genes. The HIV codon usage reflects a high content of the 
nucleotides A or T in the codon triplet. The effect of the 
HIV-1 codon usage is a high AT content in the DNA sequence 
that results in a high AU content in the RNA and in a 
decreased translation ability and instability of the mRNA. 
In comparison, highly expressed human codons prefer the 
nucleotides G or C. The Gag-encoding sequences were 
modified to be comparable to codon usage found in highly 
expressed human genes. 
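
The codon modification described above can be illustrated with 
a minimal Python sketch. The PREFERRED table below is a small, 
hypothetical subset of synonymous substitutions chosen only for 
illustration; it is not the codon table used to design the 
synthetic sequences, and the function names and the input 
fragment are likewise assumptions. 

```python
# Sketch: replace A/T-rich codons with synonymous G/C-rich codons.
# PREFERRED is an illustrative, incomplete mapping (an assumption),
# not the table used for the synthetic cassettes.
PREFERRED = {
    "GCA": "GCC", "GCT": "GCC",   # Ala
    "AGA": "CGC", "AGG": "CGC",   # Arg
    "AAA": "AAG",                 # Lys
    "GAA": "GAG",                 # Glu
    "CAA": "CAG",                 # Gln
    "TTA": "CTG", "CTA": "CTG",   # Leu
    "ATA": "ATC", "ATT": "ATC",   # Ile
    "GGA": "GGC", "GGT": "GGC",   # Gly
    "AGT": "AGC", "TCA": "AGC",   # Ser
}


def recode(cds: str) -> str:
    """Swap each codon for its preferred synonym; leave others unchanged."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(PREFERRED.get(codon, codon) for codon in codons)


def at_fraction(seq: str) -> float:
    """Fraction of A and T nucleotides in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


example = "ATGGGTGCAAGAGCAAGTGTATTA"   # hypothetical 8-codon fragment
recoded = recode(example)
print(recoded, at_fraction(example), at_fraction(recoded))
```

Because every substitution in the table is synonymous, the 
encoded amino acid sequence is unchanged while the A-T content 
of the fragment drops. 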

Figure 11 presents a comparison of the percent A-T 
content for the cDNAs of stable versus unstable RNAs 
(comparison window size = 50). Human IFNγ mRNA is known to 
(i) be unstable, (ii) have a short half-life, and (iii) have 
a high A-U content. Human GAPDH (glyceraldehyde-3-phosphate 
dehydrogenase) mRNA is known to (i) be a stable RNA, and (ii) 
have a low A-U content. In Figure 11, the percent A-T 
content of these two sequences is compared to the percent 
A-T content of native HIV-1 SF2 Gag cDNA and to the synthetic 
Gag cDNA sequence of the present invention. The top two 
panels of the figure show the percent A-T content over the 
length of the sequences for IFNγ and native Gag. The bottom 
two panels of the figure show the percent A-T content over 
the length of the sequences for GAPDH and the synthetic Gag. 
Experiments performed in support of the present invention 
showed that the synthetic Gag sequences were capable of a 
higher level of protein production (see the Examples) than 
the native Gag sequences. The data in Figure 11 suggest 
that one reason for this increased production may be 
increased stability of the mRNA corresponding to the 
synthetic Gag coding sequences versus the mRNA corresponding 
to the native Gag coding sequences. 
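
A windowed percent A-T profile of the kind plotted in such a 
comparison can be computed as in the following Python sketch. 
The window size of 50 is the comparison window stated above; 
sliding the window one nucleotide at a time, and the function 
name, are assumptions made for illustration. 

```python
def at_percent_windows(seq: str, window: int = 50) -> list[float]:
    """Percent A+T (U is counted as T) in each window along the sequence."""
    seq = seq.upper().replace("U", "T")
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    is_at = [base in "AT" for base in seq]
    count = sum(is_at[:window])            # A/T count in the first window
    profile = [100.0 * count / window]
    for i in range(window, len(seq)):
        # Slide by one base: add the entering base, drop the leaving one.
        count += is_at[i] - is_at[i - window]
        profile.append(100.0 * count / window)
    return profile


# Toy input with a window of 4 so the values are easy to check by hand.
print(at_percent_windows("AATTGGCC", window=4))
```

Plotting the returned list against window position gives one 
panel of an A-T content plot; a sequence of length n yields 
n - window + 1 values. 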

Second, there are inhibitory (or instability) elements 
(INS) located within the coding sequences of the Gag and 
Gag-protease coding sequences (Schneider R, et al., J Virol. 
71(7):4892-4903, 1997). RRE is a secondary RNA structure 
that interacts with the HIV encoded Rev-protein to overcome 
the expression down-regulating effects of the INS. To 
overcome the requirement for post-transcriptional activating 
mechanisms of RRE and Rev, and to enhance independent 
expression of the Gag polypeptide, the INS were inactivated 
by introducing multiple point mutations that did not alter 
the reading frame of the encoded proteins. Figure 1 shows 
the original SF2 Gag sequence, the location of the INS 
sequences, and the modifications made to the INS sequences 
to reduce their effects. 
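
One way to check that a set of point mutations leaves the 
encoded protein intact is to translate both sequences and 
compare the results, as in the Python sketch below. Requiring 
an identical translation is a stricter condition than 
preserving the reading frame and is an assumption made here 
for illustration; the example sequences are hypothetical and 
are not taken from Figure 1. 

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence; trailing partial codons are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def is_silent(native: str, mutated: str) -> bool:
    """True if the sequences have equal length and encode the same protein."""
    return len(native) == len(mutated) and translate(native) == translate(mutated)


native = "ATGGGTGCAAGA"        # hypothetical fragment encoding M-G-A-R
silent = "ATGGGCGCCCGC"        # same protein, different codons
missense = "ATGGGTGCAAAA"      # last codon now encodes Lys
print(is_silent(native, silent), is_silent(native, missense))
```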

For the Gag-protease sequence (wild type, SEQ ID NO:2; 
synthetic, SEQ ID NOs:5, 78 and 79), the changes in codon 
usage were restricted to the regions up to the -1 frameshift 
and starting again at the end of the Gag reading frame 
(Figure 2; the region indicated in lower case letters in 
Figure 2 is the unmodified region). Further, inhibitory (or 
instability) elements (INS) located within the coding 
sequences of the Gag-protease polypeptide coding sequence 
were altered as well (indicated in Figure 2). The synthetic 
coding sequences were assembled by the Midland Certified 
Reagent Company (Midland, Texas). 

Modifications of the Gag-polymerase sequences (wild 
type, SEQ ID NO:3; synthetic, SEQ ID NO:6) and Gag-reverse 
transcriptase sequences (SEQ ID NOs:80 through 84) include 
modifications similar to those described for Gag-protease, 
in order to preserve the frameshift region. Locations of the 
inactivation sites and changes to the sequence to alter the 
inactivation sites are presented in Figure 12 for the native 

In one embodiment of the invention, the full length 
polymerase coding region of the Gag-polymerase sequence is 
included with the synthetic Gag sequences in order to 
increase the number of epitopes for virus-like particles 
expressed by the synthetic, optimized Gag expression 
cassette. Because synthetic HIV-1 Gag-polymerase expresses 
the potentially deleterious functional enzymes reverse 
transcriptase (RT) and integrase (INT) (in addition to the 
structural proteins and protease), it is important to 
inactivate RT and INT functions. Several in-frame deletions 
in the RT and INT reading frame can be made to achieve 
catalytically nonfunctional enzymes with respect to their RT 
and INT activity. {Jay A. Levy (Editor) (1995) The 
Retroviridae, Plenum Press, New York. ISBN 0-306-45033X. 
Pages 215-20; Grimison, B. and Laurence, J. (1995), Journal 
Of Acquired Immune Deficiency Syndromes and Human 
Retrovirology 9(1):58-68; Wakefield, J. K., et al., (1992) 
Journal Of Virology 66(11):6806-6812; Esnouf, R., et al., 
(1995) Nature Structural Biology 2(4):303-308; Maignan, S., 
et al., (1998) Journal Of Molecular Biology 282(2):359-368; 
Katz, R. A. and Skalka, A. M. (1994) Annual Review Of 
Biochemistry 13 (1994); Jacobo-Molina, A., et al., (1993) 
Proceedings Of the National Academy Of Sciences Of the 
United States Of America 90(13):6320-6324; Hickman, A. B., 
et al., (1994) Journal Of Biological Chemistry 
269(46):29279-29287; Goldgur, Y., et al., (1998) Proceedings 
Of the National Academy Of Sciences Of the United States Of 
America 95(16):9150-9154; Goette, M., et al., (1998) Journal 
Of Biological Chemistry 273(17):10139-10146; Gorton, J. L., 
et al., (1998) Journal of Virology 72(6):5046-5055; 
Engelman, A., et al., (1997) Journal Of Virology 
71(5):3507-3514; Dyda, F., et al., Science 
266(5193):1981-1986; Davies, J. F., et al., (1991) Science 
252(5002):88-95; Bujacz, G., et al., (1996) Febs Letters 
398(2-3):175-178; Beard, W. A., et al., (1996) Journal Of 
Biological Chemistry 271(21):12213-12220; Kohlstaedt, L. A., 
et al., (1992) Science 256(5065):1783-1790; Krug, M. S. and 
Berger, S. L. (1991) Biochemistry 30(44):10614-10623; 
Mazumder, A., et al., (1996) Molecular Pharmacology 
49(4):621-628; Palaniappan, C., et al., (1997) Journal Of 
Biological Chemistry 272(17):11157-11164; Rodgers, D. W., 
et al., (1995) Proceedings Of the National Academy Of 
Sciences Of the United States Of America 92(4):1222-1226; 
Sheng, N. and Dennis, D. (1993) Biochemistry 
32(18):4938-4942; Spence, R. A., et al., (1995) Science 
267(5200):988-993.} 
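
The defining property of an in-frame deletion, namely that the 
number of nucleotides removed is a multiple of three so that 
downstream codons are read unchanged, can be sketched in 
Python as follows. The function name, coordinates, and example 
sequence are hypothetical and do not correspond to any 
particular RT or INT deletion. 

```python
def delete_in_frame(cds: str, start: int, length: int) -> str:
    """Remove `length` nucleotides beginning at 0-based index `start`.

    The deletion is rejected unless its length is a multiple of 3,
    which keeps the downstream reading frame intact.
    """
    if length <= 0 or length % 3 != 0:
        raise ValueError("deletion length must be a positive multiple of 3")
    if start < 0 or start + length > len(cds):
        raise ValueError("deletion lies outside the sequence")
    return cds[:start] + cds[start + length:]


# Hypothetical 6-codon sequence; remove codons 2 and 3 (indices 3..8).
original = "ATGAAACCCGGGTTTTAA"
print(delete_in_frame(original, start=3, length=6))
```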

Furthermore, selected B- and/or T-cell epitopes can be 
added to the Gag-polymerase constructs within the deletions 
of the RT- and INT-coding sequence to replace and augment 
any epitopes deleted by the functional modifications of RT 
and INT. Alternately, selected B- and T-cell epitopes 
(including CTL epitopes) from RT and INT can be included in 
a minimal VLP formed by expression of the synthetic Gag or 
synthetic GagProt cassette, described above. (For 
descriptions of known HIV B- and T-cell epitopes see, HIV 
Molecular Immunology Database CTL Search Interface; Los 
Alamos Sequence Compendia, 1987-1997; Internet address: 
http://hiv-web.lanl.gov/immunology/index.html.) 

The resulting modified coding sequences are presented 
as a synthetic Gag expression cassette (SEQ ID NO:4), a 
synthetic Gag-protease expression cassette (SEQ ID NOs:5, 78 
and 79), and a synthetic Gag-polymerase expression cassette 
(SEQ ID NO:6). Synthetic expression cassettes containing 
codon modifications in the reverse transcriptase region are 
shown in SEQ ID NOs:80 through 84. An alignment of selected 
sequences is presented in Figure 7. A common region 
(Gag-common; SEQ ID NO:9) extends from position 1 to 
position 1262. 

The synthetic DNA fragments for Gag and Gag-protease 
were cloned into the following expression vectors: pCMVKm2, 
for transient expression assays and DNA immunization 
studies; pESN2dhfr (Figure 13A) and pCMVPLEdhfr (also known 
as pCMVIII, as shown in Figure 13B), for expression in 
Chinese Hamster Ovary (CHO) cells; and pAcC13, a shuttle 
vector for use in the Baculovirus expression system (pAcC13 
was derived from pAcC12, which was described by Munemitsu 
S., et al., Mol Cell Biol. 10(11):5977-5982, 1990). 

The pCMVKm2 vector was derived from pCMV6a (Chapman et al., 
Nuc. Acids Res. (1991) 19:3979-3986) and comprises a 
kanamycin selectable marker, a ColE1 origin of replication, 
a CMV promoter enhancer and Intron A, followed by an 
insertion site for the synthetic sequences described below, 
followed by a polyadenylation signal derived from bovine 
growth hormone. The pCMVKm2 vector differs from the 
pCMV-link vector only in that a polylinker site was inserted 
into pCMVKm2 to generate pCMV-link (Figure 14, polylinker at 
positions 1646 to 1697). 

A restriction map for vector pCMV-link is presented in 
Figure 14. In the figure, the CMV promoter (CMV IE 
ENH/PRO), bovine growth hormone terminator (BGH pA), 
kanamycin selectable marker (kan), and a ColE1 origin of 
replication (ColE1 ori) are indicated. A polycloning site 
is also indicated in the figure following the CMV promoter 
sequences. 

A restriction map for vector pESN2dhfr is presented in 
Figure 13A. In the figure, the CMV promoter (pCMV, hCMVIE), 
bovine growth hormone terminator (BGHpA), SV40 origin of 
replication (SV40ori), neomycin selectable marker (Neo), 
SV40 polyA (SV40pA), Adenovirus 2 late promoter (Ad2VLP), 
and the murine dhfr gene (mu dhfr) are indicated. A 
polycloning site is also indicated in the figure following 
the CMV promoter sequences. 

Briefly, construction of pCMVPLEdhfr (pCMVIII) was as 
follows. To construct a DHFR cassette, the EMCV IRES 
(internal ribosome entry site) leader was PCR-amplified from 
pCite-4a+ (Novagen, Inc., Milwaukee, WI) and inserted into 
pET-23d (Novagen, Inc., Milwaukee, WI) as an Xba-Nco 
fragment to give pET-EMCV. The dhfr gene was PCR-amplified 
from pESN2dhfr to give a product with a Gly-Gly-Gly-Ser 
spacer in place of the translation stop codon and inserted 
as an Nco-BamHI fragment to give pET-E-DHFR. Next, the 
attenuated neo gene was PCR-amplified from a pSV2Neo 
(Clontech, Palo Alto, CA) derivative and inserted into the 
unique BamHI site of pET-E-DHFR to give pET-E-DHFR/Neo(m2). 
Then, the bovine growth hormone terminator from pCDNA3 
(Invitrogen, Inc., Carlsbad, CA) was inserted downstream of 
the neo gene to give pET-E-DHFR/Neo(m2)BGHt. The 
EMCV-dhfr/neo selectable marker cassette fragment was 
prepared by cleavage of pET-E-DHFR/Neo(m2)BGHt. The CMV 
enhancer/promoter plus Intron A was transferred from pCMV6a 
(Chapman et al., Nuc. Acids Res. (1991) 19:3979-3986) as a 
HindIII-SalI fragment into pUC19 (New England Biolabs, Inc., 
Beverly, MA). The vector backbone of pUC19 was deleted from 
the NdeI to the SapI sites. The above described DHFR 
cassette was added to the construct such that the EMCV IRES 
followed the CMV promoter to produce the final construct. 
The vector also contained an amp^r gene and an SV40 origin 
of replication. 

Selected pCMVKm2 vectors containing the synthetic 
expression cassettes have been designated as follows: 
pCMVKm2.GagMod.SF2, pCMVKm2.GagprotMod.SF2, 
pCMVKm2.GagpolMod.SF2, pCMVKm2.GagprotMod.SF2.GP1 (SEQ ID 
NO:78) and pCMVKm2.GagprotMod.SF2.GP2 (SEQ ID NO:79). Other 
exemplary Gag-encoding expression cassettes are shown in the 
Figures and as Sequence Listings. 

B. Modification of HIV-1 Gag/Hepatitis C Core Chimeric 
Protein Nucleic Acid Coding Sequences Generation of 
Synthetic Expression Cassettes 

To facilitate the ligation of the Gag and HCV core 
coding sequences, PCR amplification was employed. The 
synthetic p55Gag expression cassette was used as a PCR 
template with the following primers: GAGS (SEQ ID NO:11) and 
P55-SAL3 (SEQ ID NO:12). The PCR amplification was conducted 
at 55°C for 25 cycles using Stratagene's Pfu polymerase. 
The resulting PCR product was rendered free of nucleotides 
and primers using the Promega PCR clean-up kit and then 
subjected to EcoRI and SalI digestions. For HCV core coding 
sequences, the following primers were used with an HCV 
template (Houghton, M., et al., U.S. Patent No. 5,714,596, 
issued February 3, 1998; Houghton, M., et al., U.S. Patent 
No. 5,712,088, issued January 27, 1998; Houghton, M., et 
al., U.S. Patent No. 5,683,864, issued November 4, 1997; 
Weiner, A.J., et al., U.S. Patent No. 5,728,520, issued 
March 17, 1998; Weiner, A.J., et al., U.S. Patent No. 
5,766,845, issued June 16, 1998; Weiner, A.J., et al., U.S. 
Patent No. 5,670,152, issued September 23, 1997; all herein 
incorporated by reference): CORESAL5 (SEQ ID NO:13) and 
173CORE (SEQ ID NO:14) using the conditions outlined above. 
The purified product was digested with SalI and BamHI 
restriction enzymes. The digested Gag and HCV core PCR 
products were ligated into the pCMVKm2 vector digested with 
EcoRI and BamHI. Ligation of the PCR products at the SalI 
site resulted in a direct fusion of the final amino acid of 
p55Gag to the second amino acid of HCV core, serine. Amino 
acid 173 of core is a serine and is followed immediately by 
a TAG termination codon. The sequence of the fusion clone 
was confirmed. The pCMVKm2 vector containing the synthetic 
expression cassette was designated as pCMVKm2.GagModHCVcore. 

The EcoRI-BamHI fragment of p55Gag-core173 was also 
cloned into EcoRI-BamHI-digested pAcC13 for baculovirus 
expression. Western blots confirmed expression, and sucrose 
gradient sedimentation along with electron microscopy 
confirmed particle formation. To generate the above clone 
but containing the synthetic Gag sequences (instead of 
wild-type), the following steps were performed: 
pCMVKm2-modified p55Gag was used as template for PCR 
amplification with MS65 (SEQ ID NO:15) and MS66 (SEQ ID 
NO:16) primers. The region amplified corresponds to the 
BspHI and SalI sites at the C-terminus of synthetic Gag 
sequence. The amplification product was digested with BspHI 
and SalI and ligated to SalI/BamHI-digested pCMV-link along 
with the SalI/BspHI fragment from pCMV-Km-p55modGag, 
representing the amino terminal end of modified Gag, and the 
SalI/BamHI fragment from pCMV-p55Gag-core173. Thereafter, a 
T4-blunted-SalI partial/BamHI fragment was ligated into 
pAcC4-SmaI/BamHI to generate pAcC4-p55GagMod-core173 
(containing the synthetic sequence presented as SEQ ID 
NO:7). 

C. Defining of the Major Homology Region (MHR) of HIV-1 
p55Gag 

The Major Homology Region (MHR) of HIV-1 p55 (Gag) is 
located in the p24-CA sequence of Gag. It is a conserved 
stretch of 20 amino acids (SEQ ID NO:19). The position in 
the wild type HIV-1 SF2 Gag protein is from aa 286-305 and 
spans a region from nucleotides 856-915 in the native 
HIV-1 SF2 Gag DNA-sequence. The position in the synthetic 
Gag protein is from aa 288-307 and spans a region from 
nucleotides 862-921 for the synthetic Gag DNA-sequence. The 
nucleotide sequence for the MHR in the synthetic GagMod.SF2 
is presented as SEQ ID NO:20. Mutations or deletions in the 
amino acid sequence of the MHR can severely impair particle 
production (Borsetti, A., et al., J. Virol. 
72(11):9313-9317, 1998; Mammano, F., et al., J. Virol. 
68(8):4927-4936, 1994). 
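
The amino acid and nucleotide coordinates given for the MHR 
are related by simple codon arithmetic, which the following 
Python sketch makes explicit. It assumes 1-based numbering 
with nucleotide 1 as the first base of the first codon; the 
function name is illustrative. 

```python
def aa_range_to_nt(first_aa: int, last_aa: int) -> tuple[int, int]:
    """Map a 1-based amino acid range to its 1-based nucleotide range."""
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("expected 1 <= first_aa <= last_aa")
    first_nt = 3 * (first_aa - 1) + 1   # first base of the first codon
    last_nt = 3 * last_aa               # third base of the last codon
    return first_nt, last_nt


print(aa_range_to_nt(286, 305))   # MHR in wild type SF2 Gag
print(aa_range_to_nt(288, 307))   # MHR in synthetic Gag
```

Under this numbering, aa 286-305 correspond to nucleotides 
856-915 and aa 288-307 to nucleotides 862-921, a span of 60 
nucleotides for the 20 amino acids in each case. 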

Percent identity to the MHR nucleotide sequence can be 
determined, for example, using the MacDNAsis program 
(Hitachi Software Engineering America Limited, South San 
Francisco, CA), Higgins algorithm, with the following 
exemplary parameters: gap penalty = 5, no. of top diagonals 
= 5, fixed gap penalty = 5, K-tuple = 2, window size = 5, 
and floating gap penalty = 10. 
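
For orientation only, the final step of such a comparison, 
counting identical positions in an alignment that has already 
been computed, can be sketched in Python as follows. This is 
not the Higgins algorithm and does not reproduce MacDNAsis or 
its parameters; dividing by the full alignment length 
(including gap columns) is one of several possible conventions 
and is an assumption here. 

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same non-gap residue.

    Both inputs must be rows of one pre-computed pairwise alignment,
    with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        x == y and x != "-"
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 10-column alignment with a single mismatch.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```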

D. Generation of Synthetic Env Expression Cassettes 

Env coding sequences of the present invention include, 
but are not limited to, polynucleotide sequences encoding 
the following HIV-encoded polypeptides: gp160, gp140, and 
gp120 (see, e.g., U.S. Patent No. 5,792,459 for a 
description of the HIV-1 SF2 ("SF2") Env polypeptide). The 
relationships between these polypeptides are shown 
schematically in Figure 15 (in the figure: the polypeptides 
are indicated as lines; the amino and carboxy termini are 
indicated on the gp160 line; the open circle represents the 
oligomerization domain; the open square represents a 
transmembrane spanning domain (TM); and "c" represents the 
location of a cleavage site; in gp140.mut the "X" indicates 
that the cleavage site has been mutated such that it no 
longer functions as a cleavage site). The polypeptide gp160 
includes the coding sequences for gp120 and gp41. The 
polypeptide gp41 is comprised of several domains including 
an oligomerization domain (OD) and a transmembrane spanning 
domain (TM). In the native envelope, the oligomerization 
domain is required for the non-covalent association of three 
gp41 polypeptides to form a trimeric structure: through 
non-covalent interactions with the gp41 trimer (and itself), 
the gp120 polypeptides are also organized in a trimeric 
structure. A cleavage site (or cleavage sites) exists 
approximately between the polypeptide sequences for gp120 
and the polypeptide sequences corresponding to gp41. This 
cleavage site(s) can be mutated to prevent cleavage at the 
site. The resulting gp140 polypeptide corresponds to a 
truncated form of gp160 where the transmembrane spanning 
domain of gp41 has been deleted. This gp140 polypeptide can 
exist in both monomeric and oligomeric (i.e. trimeric) forms 
by virtue of the presence of the oligomerization domain in 
the gp41 moiety. In the situation where the cleavage site 
has been mutated to prevent cleavage and the transmembrane 
portion of gp41 has been deleted, the resulting polypeptide 
product is designated "mutated" gp140 (e.g., gp140.mut). As 
will be apparent to those in the field, the cleavage site 
can be mutated in a variety of ways. The native amino acid 
sequence in the SF162 cleavage sites is: APTKAKRRVVQREKR 
(SEQ ID NO:21), where KAKRR (SEQ ID NO:22) is termed the 
"second" site and REKR (SEQ ID NO:23) is the "first" site. 
Exemplary mutations include the following constructs: 
gp140.mut7.modSF162, which encodes the amino acid sequence 
APTKAISSVVQSEKS (SEQ ID NO:24) in the cleavage site region; 
gp140.mut8.modSF162, which encodes the amino acid sequence 
APTIAISSVVQSEKS (SEQ ID NO:25) in the cleavage site region; 
and gp140.mut.modSF162, which encodes the amino acid sequence 
APTKAKRRVVQREKS (SEQ ID NO:26). Mutations are denoted in 
bold. The native amino acid sequence in the US4 cleavage 
sites is: APTQAKRRVVQREKR (SEQ ID NO:27), where QAKRR (SEQ 
ID NO:28) is termed the "second" site and REKR (SEQ ID 
NO:23) is the "first" site. Exemplary mutations include the 
following construct: gp140.mut.modUS4, which encodes the 
amino acid sequence APTQAKRRVVQREKS (SEQ ID NO:29) in the 
cleavage site region. Mutations are denoted in bold. 
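
The substitutions that distinguish each mutated SF162 
cleavage-site region from the native sequence can be listed 
by a position-by-position comparison, as in the Python sketch 
below. The peptide strings are typed in here as illustrative 
constants, and the positions are numbered from the first 
residue of the peptide (an assumption), not from the start of 
Env. 

```python
NATIVE_SF162 = "APTKAKRRVVQREKR"

# Construct names and peptides entered by hand for illustration.
MUTANTS = {
    "gp140.mut7.modSF162": "APTKAISSVVQSEKS",
    "gp140.mut8.modSF162": "APTIAISSVVQSEKS",
    "gp140.mut.modSF162": "APTKAKRRVVQREKS",
}


def substitutions(native: str, mutant: str) -> list[str]:
    """List differences as e.g. 'K6I' (native residue, position, new residue)."""
    if len(native) != len(mutant):
        raise ValueError("sequences must have equal length")
    return [
        f"{n}{pos}{m}"
        for pos, (n, m) in enumerate(zip(native, mutant), start=1)
        if n != m
    ]


for name, peptide in MUTANTS.items():
    print(name, substitutions(NATIVE_SF162, peptide))
```

In each mutant the C-terminal arginine of the REKR site is 
replaced by serine; the mut7 and mut8 peptides additionally 
alter the KAKRR site. 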

E. Modification of HIV-1 Env (Envelope) Nucleic Acid 
Coding Sequences 

In one embodiment of the present invention, wild-type 
Env coding sequences were selected from the HIV-1 SF162 
("SF162") strain (Cheng-Mayer (1989) PNAS USA 86:8575-8579). 
These SF162 sequences were as follows: gp120, SEQ ID NO:30 
(Fig. 16); gp140, SEQ ID NO:31 (Fig. 17); and gp160, SEQ ID 
NO:32 (Fig. 18). 

In another embodiment of the present invention, 
wild-type Env coding sequences were selected from the 
HIV-US4 strain (Mascola, et al. (1994) J. Infect. Dis. 
169:48-54). These US4 sequences were as follows: gp120, SEQ 
ID NO:51 (Fig. 38); gp140, SEQ ID NO:52 (Fig. 39); and 
gp160, SEQ ID NO:53 (Fig. 40). 

These Env coding sequences were manipulated to maximize 
expression of their gene products. 

First, the wild-type coding region was modified in one 
or more of the following ways. In one embodiment, sequences 
encoding hypervariable regions of Env, particularly V1 
and/or V2, were deleted. In other embodiments, mutations 
were introduced into sequences encoding the cleavage site in 
Env to abrogate the enzymatic cleavage of oligomeric gp140 
into gp120 monomers. (See, e.g., Earl et al. (1990) PNAS 
USA 87:648-652; Earl et al. (1991) J. Virol. 65:31-41). In 
yet other embodiments, hypervariable region(s) were deleted, 
N-glycosylation sites were removed and/or cleavage sites 
mutated. 

Second, the HIV-1 codon usage pattern was modified so 
that the resulting nucleic acid coding sequence was 
comparable to codon usage found in highly expressed human 
genes. The HIV codon usage reflects a high content of the 
nucleotides A or T in the codon triplet. The effect of the 
HIV-1 codon usage is a high AT content in the DNA sequence 
that results in a decreased translation ability and 
instability of the mRNA. In comparison, highly expressed 
human codons prefer the nucleotides G or C. The Env coding 
sequences were modified to be comparable to codon usage 
found in highly expressed human genes. 
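
The A/T bias within the codon triplet can be quantified 
separately for each of the three codon positions, as in the 
following Python sketch. The function name and the two short 
input fragments are hypothetical and serve only to show the 
calculation; they are not Env sequences. 

```python
def at_by_codon_position(cds: str) -> tuple[float, float, float]:
    """Fraction of A/T at codon positions 1, 2 and 3 of a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # ignore a trailing partial codon
    if usable == 0:
        raise ValueError("sequence must contain at least one full codon")
    n_codons = usable // 3
    first, second, third = (
        sum(cds[i] in "AT" for i in range(pos, usable, 3)) / n_codons
        for pos in range(3)
    )
    return first, second, third


at_rich = "ATGGGTGCAAGA"     # hypothetical fragment with A/T-rich codons
gc_rich = "ATGGGCGCCCGC"     # synonymous fragment with G/C-rich codons
print(at_by_codon_position(at_rich))
print(at_by_codon_position(gc_rich))
```

Synonymous recoding mostly changes the third codon position, 
so a drop in the third value with little change in the first 
two is the expected signature of this kind of modification. 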

Figures 22A-22H present comparisons of the percent A-T
content for the cDNAs of stable versus unstable RNAs
(comparison window size = 50). Human IFNγ mRNA is known to
(i) be unstable, (ii) have a short half-life, and (iii) have
a high A-U content. Human GAPDH (glyceraldehyde-3-phosphate
dehydrogenase) mRNA is known to (i) be a stable RNA, and (ii)
have a low A-U content. In Figures 22A-H, the percent A-T
content of these two sequences is compared to the percent
A-T content of (1) native HIV-1 US4 Env gp160 cDNA and a
synthetic US4 Env gp160 cDNA sequence (i.e., having modified
codons) of the present invention; and (2) native HIV-1 SF162
Env gp160 cDNA and a synthetic SF162 Env gp160 cDNA sequence
(i.e., having modified codons) of the present invention.
Figures 22A-H show the percent A-T content over the length
of the sequences for IFNγ (Figures 22C and 22G); native
gp160 Env US4 and SF162 (Figures 22A and 22E, respectively);
GAPDH (Figures 22D and 22H); and the synthetic gp160 Env for
US4 and SF162 (Figures 22B and 22F). Experiments performed
in support of the present invention showed that the
synthetic Env sequences were capable of a higher level of
protein production (see the Examples) than the native Env
sequences. The data in Figures 22A-H suggest that one
reason for this increased production is increased stability
of the mRNA corresponding to the synthetic Env coding
sequences versus the mRNA corresponding to the native Env
coding sequences.
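
A windowed percent A-T profile of the kind plotted in Figures 22A-H can be computed as sketched below. The function name and the use of a step of one nucleotide between windows are assumptions; the source states only that the comparison window size was 50.

```python
def at_percent_windows(seq, window=50):
    """Percent A/T (or A/U) in each full-length window, sliding one base at a time.

    Returns an empty list when the sequence is shorter than the window.
    """
    seq = seq.upper()
    is_at = [1 if base in "ATU" else 0 for base in seq]
    if window <= 0:
        raise ValueError("window must be positive")
    if len(seq) < window:
        return []
    count = sum(is_at[:window])          # A/T count in the first window
    profile = [100.0 * count / window]
    for i in range(window, len(seq)):
        # Slide right by one base: add the new base, drop the oldest one.
        count += is_at[i] - is_at[i - window]
        profile.append(100.0 * count / window)
    return profile


# Small demonstration with a window of 4 instead of 50.
print(at_percent_windows("AATTGGCC", window=4))
```

Plotting such a profile for a native and a synthetic coding sequence side by side gives the kind of comparison the figures describe.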

To create the synthetic coding sequences of the present
invention, the gene cassettes were designed to comprise the
entire coding sequence of interest. Synthetic gene cassettes
were constructed by oligonucleotide synthesis and PCR
amplification to generate gene fragments. Primers were
chosen to provide convenient restriction sites for
subcloning. The resulting fragments were then ligated to
create the entire desired sequence, which was then cloned
into an appropriate vector. The final synthetic sequences
were (i) screened by restriction endonuclease digestion and
analysis, and (ii) subjected to DNA sequencing in order to
confirm that the desired sequence had been obtained; and
(iii) the identity and integrity of the expressed protein
were confirmed by SDS-PAGE and Western blotting (see
Examples). The synthetic coding sequences were assembled at
Chiron Corp. or by the Midland Certified Reagent Company
(Midland, Texas).

Exemplary modified coding sequences are presented as
synthetic Env expression cassettes in Tables 1A and 1B. The
following expression cassettes (i) have unique, terminal
EcoRI and XbaI cloning sites; (ii) include Kozak sequences
(to direct the Env polypeptide to the cell membrane, see,
e.g., Chapman et al., infra); (iv) open reading frames
optimized for expression in mammalian cells; and (v) a
translational stop signal codon.
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Table 1A: Exemplary Synthetic Env Expression 
Cassettes (SF162) 



Expression Cassette           Seq Id   Further Information
gp120 SF162                   30       wild-type; Figure 16
gp140 SF162                   31       wild-type; Figure 17
gp160 SF162                   32       wild-type; Figure 18
gp120.modSF162                33       none; Figure 19
gp120.modSF162.delV2          34       deleted V2 loop; Figure 20
gp120.modSF162.delV1/V2       35       deleted V1 and V2; Figure 21
gp140.modSF162                36       none; Figure 23
gp140.modSF162.delV2          37       deleted V2 loop; Figure 24
gp140.modSF162.delV1/V2       38       deleted V1 and V2; Figure 25
gp140.mut.modSF162            39       mutated cleavage site; Figure 26
gp140.mut.modSF162.delV2      40       deleted V2; mutated cleavage site
gp140.mut.modSF162.delV1/V2   41       deleted V1 & V2; mutated cleavage site; Figure 28
gp140.mut7.modSF162           42       mutated cleavage site; Fig. 29
gp140.mut7.modSF162.delV2     43       mutated cleavage site; deleted V2; Figure 30
gp140.mut7.modSF162.delV1/V2  44       mutated cleavage site; deleted V1 and V2; Figure 31
gp140.mut8.modSF162           45       mutated cleavage site; Fig. 32
gp140.mut8.modSF162.delV2     46       mutated cleavage site; deleted V2; Figure 33
gp140.mut8.modSF162.delV1/V2  47       mutated cleavage site; deleted V1 and V2; Figure 34
gp160.modSF162                48       none; Figure 35
gp160.modSF162.delV2          49       deleted V2 loop; Figure 36
gp160.modSF162.delV1/V2       50       deleted V1 & V2; Figure 37
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Table IB: 

Exemplary Synthetic Env Expression Cassettes (US4)

Expression Cassette           Seq Id   Further Information
gp120 US4                     51       wild-type; Figure 38
gp140 US4                     52       wild-type; Figure 39
gp160 US4                     53       wild-type; Figure 40
gp120.modUS4                  54       none; Figure 41
gp120.modUS4.del 128-194      55       deletion in V1 and V2 regions; Figure 42
gp140.modUS4                  56       none; Figure 43
gp140.mut.modUS4              57       mutated cleavage site; Figure 44
gp140TM.modUS4                58       native transmembrane region; Figure 45
gp140.modUS4.delV1/V2         59       deleted V1 and V2; Figure 46
gp140.modUS4.delV2            60       deleted V1; Figure 47
gp140.mut.modUS4.delV1/V2     61       mutated cleavage site; deleted V1 and V2; Figure 48
gp140.modUS4.del 128-194      62       deletion in V1 and V2 regions; Figure 49
gp140.mut.modUS4.del 128-194  63       mutated cleavage site; deletion in V1 and V2 regions; Figure 50
gp160.modUS4                  64       none; Figure 51
gp160.modUS4.delV1            65       deleted V1; Figure 52
gp160.modUS4.delV2            66       deleted V2; Figure 53
gp160.modUS4.delV1/V2         67       deleted V1 and V2; Figure 54
gp160.modUS4.del 128-194      68       deletion in V1 and V2 regions; Figure 55



Alignments of the sequences presented in the above
tables are presented in Figures 66A and 66B.

A common region (Env-common) extends from nucleotide
position 1186 to nucleotide position 1329 (SEQ ID NO:69,
Fig. 56) relative to the wild-type US4 sequence and from
nucleotide position 1117 to position 1260 (SEQ ID NO:70,
Fig. 57) relative to the wild-type SF162 sequence. The
synthetic sequences of the present invention corresponding
to these regions are presented as SEQ ID NO:71 (Figure 58)
for the synthetic Env US4 common region and as SEQ ID NO:72
(Figure 59) for the synthetic Env SF162 common region.

Percent identity to this sequence can be determined,
for example, using the Smith-Waterman search algorithm (Time
Logic, Incline Village, NV), with the following exemplary
parameters: weight matrix = nuc4x4hb; gap opening penalty =
20; gap extension penalty = 5; reporting threshold = 1;
alignment threshold = 20.
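
For orientation, a percent-identity calculation based on Smith-Waterman local alignment with affine gap penalties can be sketched as follows. This is not the Time Logic implementation. The match and mismatch scores are assumed stand-ins for the nuc4x4hb matrix (whose values are not given here), the gap opening penalty is taken to be the cost of the first gapped position, and percent identity is defined as matches divided by alignment columns; all three are assumptions.

```python
def smith_waterman_identity(a, b, match=5, mismatch=-4, gap_open=20, gap_extend=5):
    """Percent identity over the best local alignment of a and b (affine gaps)."""
    neg = float("-inf")
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]    # best score ending at (i, j)
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # alignment ends with a gap in a
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # alignment ends with a gap in b
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            if H[i][j] > best:
                best, bi, bj = H[i][j], i, j
    if best == 0:
        return 0.0
    # Trace back from the best-scoring cell, counting matches and columns.
    i, j, state = bi, bj, "H"
    matches = columns = 0
    while i > 0 and j > 0:
        if state == "H":
            if H[i][j] == 0:
                break
            s = match if a[i - 1] == b[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + s:
                matches += a[i - 1] == b[j - 1]
                columns += 1
                i, j = i - 1, j - 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":
            columns += 1
            if E[i][j] != E[i][j - 1] - gap_extend:
                state = "H"  # this position opened the gap
            j -= 1
        else:
            columns += 1
            if F[i][j] != F[i - 1][j] - gap_extend:
                state = "H"  # this position opened the gap
            i -= 1
    return 100.0 * matches / columns


print(smith_waterman_identity("AAAACCCCGGGG", "AAAACCCTGGGG"))
```

Reported identity values depend on the scoring matrix and on how identity is normalized, so numbers from this sketch are not expected to match those of a specific commercial implementation.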

Various forms of the different embodiments of the
present invention (e.g., constructs) may be combined.

F. Cloning Synthetic Env Expression Cassettes of the
Present Invention

The synthetic DNA fragments encoding the Env
polypeptides were typically cloned into the eucaryotic
expression vectors described above for Gag, for example,
pCMVKm2/pCMVlink (Figure 4), pCMV6a, pESN2dhfr (Figure 13A),
pCMVIII (Figure 13B; alternately designated as the
pCMV-PL-E-dhfr/neo vector).

Exemplary designations for pCMVlink vectors containing
synthetic expression cassettes of the present invention are
as follows: pCMVlink.gp140.modSF162;
pCMVlink.gp140.modSF162.delV2; pCMVlink.gp140.mut.modSF162;
pCMVlink.gp140.mut.modSF162.delV2; pCMVKm2.gp140.modUS4;
pCMVKm2.gp140.modUS4.delV2; pCMVKm2.gp140.mut.modUS4; and
pCMVKm2.gp140.mut.modUS4.delV1/V2.

G. Generation of Synthetic Tat Expression Cassettes

Tat coding sequences have also been modified according
to the teachings of the present specification. The wild-type
nucleotide sequence encoding tat from variant SF162 is
presented in Figure 76 (SEQ ID NO:85). The corresponding
wild-type amino acid sequence is presented in Figure 77 (SEQ
ID NO:86). Figure 81 (SEQ ID NO:89) shows the nucleotide
sequence encoding the amino terminus of the tat protein, in
which the codon encoding cysteine-22 is underlined. Other
exemplary constructs encoding synthetic tat polypeptides are
shown in Figures 78 and 79 (SEQ ID NOs:87 and 88). In one
embodiment (SEQ ID NO:88), the cysteine residue at position
22 is replaced by a glycine. Caputo et al. (1996) Gene
Therapy 3:235 have shown that this mutation affects the
trans-activation domain of Tat.

Various forms of the different embodiments of the
invention, described herein, may be combined.

H. Deposit of Vectors

Selected exemplary constructs shown below and described
herein are deposited at Chiron Corporation, Emeryville, CA,
94662-8097, and were sent to the American Type Culture
Collection, 10801 University Boulevard, Manassas, VA
20110-2209 on December 27, 1999.
Plasmid Name                     Chiron Deposit #   Date Sent to ATCC
pCMVgp160.modUS4                 5094               27 Dec 99
pCMVgp160del1.modUS4             5095               27 Dec 99
pCMVgp160del2.modUS4             5096               27 Dec 99
pCMVgp160del-2.modUS4            5097               27 Dec 99
pCMVgp160del128-194.mod.US4      5098               27 Dec 99
pCMVgp140mut.modUS4del128-194    5100               27 Dec 99
pCMVgp140.mut.mod.US             5101               27 Dec 99
pCMVgp160.modSF162               5125               27 Dec 99
pCMVgp160.modSF162.delV2         5126               27 Dec 99
pCMVgp160.modSF162.delV1V2       5127               27 Dec 99
pCMVgp140.mut.modSF162delV2      5128               27 Dec 99
pCMVgp140.mut7.modSF162          5129               27 Dec 99
pCMVgp140.mut7.modSF162delV2     5130               27 Dec 99
pCMVgp140.mut8.modSF162          5131               27 Dec 99
pCMVgp140.mut8.modSF162delV2     5132               27 Dec 99
pCMVgp140.mut8.modSF162delV1V2   5133               27 Dec 99
pCMVKm2.Gagprot.Mod.SF2.GP1      5150               27 Dec 99
pCMVKm2.Gagprot.Mod.SF2.GP2      5151               27 Dec 99



Example 2
Expression Assays for the
Synthetic Gag, Env and Tat Coding Sequences
A. Gag and Gag-Protease Coding Sequences

The HIV-1SF2 wild-type Gag (SEQ ID NO:1) and Gag-protease
(SEQ ID NO:2) sequences were cloned into expression
vectors having the same features as the vectors into which
the synthetic Gag (SEQ ID NO:4) and Gag-protease (SEQ ID
NOs:5, 78 or 79) sequences were cloned.

Expression efficiencies for various vectors carrying
the HIV-1SF2 wild-type and synthetic Gag sequences were
evaluated as follows. Cells from several mammalian cell
lines (293, RD, COS-7, and CHO; all obtained from the
American Type Culture Collection, 10801 University
Boulevard, Manassas, VA 20110-2209) were transfected with 2
µg of DNA in transfection reagent LT1 (PanVera Corporation,
545 Science Dr., Madison, WI). The cells were incubated for
5 hours in reduced serum medium (Opti-MEM, Gibco-BRL,
Gaithersburg, MD). The medium was then replaced with normal
medium as follows: 293 cells, IMDM, 10% fetal calf serum, 2%
glutamine (BioWhittaker, Walkersville, MD); RD and COS-7
cells, D-MEM, 10% fetal calf serum, 2% glutamine (Opti-MEM,
Gibco-BRL, Gaithersburg, MD); and CHO cells, Ham's F-12, 10%
fetal calf serum, 2% glutamine (Opti-MEM, Gibco-BRL,
Gaithersburg, MD). The cells were incubated for either 48
or 60 hours. Supernatants were harvested and filtered
through 0.45 µm syringe filters and, optionally, stored at
-20°C.

Supernatants were evaluated using the Coulter p24 assay
(Coulter Corporation, Hialeah, FL, US), using 96-well plates
coated with a murine monoclonal antibody directed against
HIV core antigen. The HIV-1 p24 antigen binds to the coated
wells. Biotinylated antibodies against HIV recognize the
bound p24 antigen. Conjugated streptavidin-horseradish
peroxidase reacts with the biotin. Color develops from the
reaction of peroxidase with TMB substrate. The reaction is
terminated by addition of 4N H2SO4. The intensity of the
color is directly proportional to the amount of HIV p24
antigen in a sample.
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The results of these expression assays are presented in 
Tables 2A and 2B. Tables 2A and 2B show data obtained
using the synthetic Gag-protease expression cassette of SEQ
ID NO:5. Similar results were obtained using the
Gag-protease expression cassettes of SEQ ID NOs:78 and 79.
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Table 2: in vitro gag and gagprot p24 expression 



TABLE 2a. Increased in vitro expression from modified vs. native gag
plasmids in supernatants and lysates from transiently transfected cells

experiment  native (nat)a /   supernatant (sup) /  cell line  hours post    total ng p24
            modified (mod)b   lysate (lys)                    transfection  (fold increase)
1           nat               sup                  293        48            3.4
            mod               sup                  293        48            1260 (371)
            nat               sup                  293        60            3.2
            mod               sup                  293        60            2222 (694)
2           nat               sup                  293        60            1.8
            mod               sup                  293        60            1740 (966)
3           nat               sup                  293        60            1.8
            mod               sup                  293        60            580 (322)
4           nat               lys                  293        60            1.5
            mod               lys                  293        60            85 (57)
1           nat               sup                  RD         48            5.6
            mod               sup                  RD         48            66 (12)
            nat               sup                  RD         60            7.8
            mod               sup                  RD         60            70.2 (9)
2           nat               lys                  RD         60            1.9
            mod               lys                  RD         60            7.8 (4)
1           nat               sup                  COS-7      48            0.4
            mod               sup                  COS-7      48            33.4 (84)
2           nat               sup                  COS-7      48            0.4
            mod               sup                  COS-7      48            10 (25)
            nat               lys                  COS-7      48            3
            mod               lys                  COS-7      48            14 (5)

a pCMVLink.Gag.SF2.PRE
b pCMVKm2.GagMod.SF2
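
The fold-increase figures in parentheses in Table 2a are the ratio of p24 from the modified plasmid to p24 from the native plasmid under the same conditions. The short sketch below recomputes a few of them from the tabulated values; the function name and the choice to round to the nearest whole number are assumptions, and a printed value can differ by one unit where the table appears to have truncated rather than rounded.

```python
def fold_increase(native_ng, modified_ng):
    """Ratio of modified to native p24 (both in total ng)."""
    if native_ng <= 0:
        raise ValueError("native value must be positive")
    return modified_ng / native_ng


# (cell line, fraction, hours, native ng p24, modified ng p24) from Table 2a
rows = [
    ("293", "sup", 48, 3.4, 1260),
    ("293", "sup", 60, 3.2, 2222),
    ("293", "lys", 60, 1.5, 85),
    ("RD", "sup", 48, 5.6, 66),
    ("COS-7", "sup", 48, 0.4, 10),
]

for cell_line, fraction, hours, native, modified in rows:
    ratio = fold_increase(native, modified)
    print(f"{cell_line:6s} {fraction} {hours} h: {ratio:.1f}-fold")
```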
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TABLE 2b. In vitro expression from modified gag and gagprotease 
plasmids in supernatants and lysates from transiently transfected 

cells 



plasmid        supernatant (sup) /  cell line  hours post    total ng p24 d
               lysate (lys)                    transfection
Gag a          sup                  293        60            760
GagProt(GP1)b  sup                  293        60            380
GagProt(GP2)c  sup                  293        60            320
Gag            lys                  293        60            78
GagProt(GP1)   lys                  293        60            1250
GagProt(GP2)   lys                  293        60            400
Gag            sup                  COS-7      72            40
GagProt(GP1)   sup                  COS-7      72            150
GagProt(GP2)   sup                  COS-7      72            290
Gag            lys                  COS-7      72            60
GagProt(GP1)   lys                  COS-7      72            63
GagProt(GP2)   lys                  COS-7      72            58

a pCMVKm2.GagMod.SF2

b pCMVKm2.GagProtMod.SF2 (GP1): gagprotease with codon optimization
and inactivation of INS in protease

c pCMVKm2.GagProtMod.SF2 (GP2): gagprotease with only inactivation
of INS in protease

d Shown are representative results from 3 independent experiments for
each cell line tested.
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The data showed that the synthetic Gag and Gag-protease 
expression cassettes provided dramatic increases in 
production of their protein products, relative to the native 
(HIV-1SF2 wild-type) sequences, when expressed in a variety
of cell lines.

B. Env Coding Sequences 

The HIV-SF162 ("SF162") wild-type Env (SEQ ID NO: 1-3)
and HIV-US4 ("US4") wild-type Env (SEQ ID NO: 22-24)
sequences were cloned into expression vectors having the
same features as the vectors into which the synthetic Env
sequences were cloned.

Expression efficiencies for various vectors carrying
the SF162 and US4 wild-type and synthetic Env sequences were
evaluated essentially as described above for Gag, except that
cell lysates were prepared in 40 lysis buffer (1.0%
NP40, 0.1 M Tris pH 7.5) and frozen at -20°C, and capture
ELISAs were performed as follows.

For Capture ELISAs, 250 ng of an ammonium sulfate IgG
cut of goat polyclonal antibody to gp120SF2/env2-3 was used
to coat each well of a 96-well plate (Corning, Corning, NY).
Serial dilutions of gp120/SF2 protein (MID 167) were used to
set the quantitation curve from which expression of US4 or
SF162 gp120 proteins from transfection supernatants and
lysates was calculated. Samples were screened undiluted
and, optionally, by serial 2-fold dilutions. A human
polyclonal antibody to HIV-1 gp120/SF2 was used to detect
bound gp120 envelope protein, followed by horseradish
peroxidase (HRP)-labeled goat anti-human IgG conjugates.
TMB (Pierce, Rockford, IL) was used as the substrate and the
reaction was terminated by addition of 4N H2SO4. The
reaction was quantified by measuring the optical density
(OD) at 450 nm. The intensity of the color is directly
proportional to the amount of HIV gp120 antigen in a sample.
Purified SF2 gp120 protein was diluted and used as a
standard.
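
Reading a sample concentration off a quantitation curve of this kind can be sketched as below. The standard concentrations and OD readings are hypothetical placeholder values chosen only to make the example run; they are not data from these experiments. Piecewise-linear interpolation between neighboring standards is an assumed curve-fitting choice, since the source does not state how the curve was fitted.

```python
def interpolate_concentration(od, standards, dilution_factor=1.0):
    """Estimate antigen concentration from an OD450 reading.

    standards: list of (concentration, od) pairs from the dilution series.
    The reading must fall within the OD range spanned by the standards.
    """
    points = sorted(standards, key=lambda p: p[1])  # order by OD
    if not points[0][1] <= od <= points[-1][1]:
        raise ValueError("OD outside the range of the standard curve")
    for (c0, od0), (c1, od1) in zip(points, points[1:]):
        if od0 <= od <= od1:
            if od1 == od0:
                return c0 * dilution_factor
            fraction = (od - od0) / (od1 - od0)
            return (c0 + (c1 - c0) * fraction) * dilution_factor
    raise ValueError("no bracketing standards found")


# Hypothetical standards: (ng/ml gp120, OD at 450 nm)
standards = [(0.0, 0.05), (25.0, 0.30), (50.0, 0.55), (100.0, 1.05)]

print(interpolate_concentration(0.425, standards))                     # undiluted sample
print(interpolate_concentration(0.425, standards, dilution_factor=2))  # 2-fold dilution
```

Readings above the top standard are rejected rather than extrapolated, which is why screening samples at serial 2-fold dilutions is useful.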

The results of the transient expression assays are 
presented in Tables 3 and 4. Table 3 depicts transient 
expression in 293 cells transfected with a pCMVKm2 vector 
carrying the Env cassette of interest. Table 4 depicts 
transient expression in RD cells transfected with a pCMVKm2 
vector carrying the Env cassette of interest. 
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Table 4 



The data showed that the synthetic Env expression
cassettes provided a significant increase in production of
their protein products, relative to the native (HIV-1 SF162
or US4 wild-type) sequences, when expressed in a variety of
cell lines.

C. CHO Cell Line Env Expression Data

Chinese hamster ovary (CHO) cells were transfected with
plasmid DNA encoding the synthetic HIV-1 gp120 or gp140
proteins (e.g., pESN2dhfr or pCMVIII vector backbone) using
Mirus TransIT-LT1 polyamine transfection reagent (PanVera)
according to the manufacturer's instructions and incubated
for 96 hours. After 96 hours, the medium was changed to
selective medium (F12 special with 250 µg/ml G418) and cells
were split 1:5 and incubated for an additional 48 hours.
The medium was changed every 5-7 days until colonies started
forming, at which time the colonies were picked, plated into
96-well plates and screened by gp120 Capture ELISA.
Positive clones were expanded in 24-well plates and screened
several times for Env protein production by Capture ELISA,
as described above. After reaching confluency in 24-well
plates, positive clones were expanded to T25 flasks
(Corning, Corning, NY). These were screened several times
after confluency and positive clones were expanded to T75
flasks.

Positive T75 clones were frozen in LN2 and the highest
expressing clones were amplified with 0-5 µM methotrexate
(MTX) at several concentrations and plated in 100 mm culture
dishes. Plates were screened for colony formation and all
positive clones were again expanded as described above.
Clones were expanded and amplified, and screened at each
step by gp120 capture ELISA. Positive clones were frozen at
each methotrexate level. The highest producing clones were
grown in perfusion bioreactors (3L, 100L) for expansion and
adaptation to low serum suspension culture conditions for
scale-up to larger bioreactors.

Tables 5 and 6 show Capture ELISA data from CHO cells
transfected with pCMVIII vector carrying a cassette encoding
synthetic HIV-US4 and SF162 Env polypeptides (e.g., mutated
cleavage sites, modified codon usage and/or deleted
hypervariable regions). Thus, stably transfected CHO cell
lines which express Env polypeptides (e.g., gp120,
gp140-monomeric, and gp140-oligomeric) have been produced.
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Table 5 



CHO Cell Lines Expression Level of US4 Envelope Constructs

Constructs                  CHO Clone #   MTX Level   Expression Level* (ng/ml)
gp120.modUS4                1             3.2 µM      250-450
                            2             1.6 µM      350-450
                            3             200 nM      230-580***
                            4             200 nM      300-500
gp140.modUS4                1             1 µM        155-300
                            2             1 µM        100-260
                            3             1 µM        200-430
gp140.mut.modUS4            1             1 µM        110-270
                            2             1 µM        100-235
                            3             1 µM        100-220
gp140.modUS4.delV1/V2       1             50 nM       313-587**
                            2             50 nM       237-667**
                            3             50 nM       492-527**
gp140.mut.modUS4.delV1/V2   1             50 nM       46-328**
                            2             50 nM       82-318**
                            3             50 nM       204-385**

*All samples measured at T-75 flask stage unless otherwise
indicated

**at 24 well and 6 well plate stages

***in a three liter bioreactor perfusion culture this clone
yielded approximately 2-5 µg/ml.
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Table 6 



CHO Cell Lines Expression Level of SF162 Envelope Constructs

Constructs                  CHO Clone #   MTX Level   Expression Level* (ng/ml)
gp120.modSF162              1             0           755-2705
                            2             0           928-1538
                            3             0           538-1609
gp140.modSF162              1             20 nM       180-350
gp140.mut.modSF162          1             20 nM       164-451
                            2             20 nM       188-487
                            3             20 nM       233-804
gp120.modSF162.delV2        1             800 nM      528-1560
                            2             800 nM      487-1878
                            3             800 nM      589-1212
gp140.modSF162.delV2        1             800 nM      300-600
                            2             800 nM      200-400
                            3             800 nM      200-500
gp140.mut.modSF162.delV2    1             800 nM      300-700
                            2             400 nM      1161
                            3             800 nM      400-600
                            4             400 nM      1600-2176

*All samples measured at T-75 flask stage unless otherwise
indicated



The results presented above demonstrate the ability of
the constructs of the present invention to provide
expression of Env polypeptides in CHO cells. Production of
polypeptides using CHO cells provides (i) correct
glycosylation patterns and protein conformation (as
determined by binding to a panel of MAbs); (ii) correct
binding to CD4 receptor molecules; (iii) absence of
non-mammalian cell contaminants (e.g., insect viruses and/or
cells); and (iv) ease of purification.

D. Tat Coding Sequences

The HIV-SF162 ("SF162") wild-type Tat (SEQ ID NO:85)
sequences were cloned into expression vectors having the
same features as the vectors into which the synthetic Tat
sequences were cloned (SEQ ID NOs:87, 88 and 89).

Expression efficiencies for various vectors carrying
the SF162 wild-type and synthetic Tat sequences are
evaluated essentially as described above for Gag and Env
using capture ELISAs with the appropriate anti-tat
antibodies and/or CHO cell assays. Expression of the
polypeptides encoded by the synthetic cassettes is improved
relative to wild type.



Example 3
Western Blot Analysis of Expression
A. Gag and Gag-Protease Coding Sequences

Human 293 cells were transfected as described in
Example 2 with pCMV6a-based vectors containing native or
synthetic Gag expression cassettes. Cells were cultivated
for 60 hours post-transfection. Supernatants were prepared
as described. Cell lysates were prepared as follows. The
cells were washed once with phosphate-buffered saline, lysed
with detergent [1% NP40 (Sigma Chemical Co., St. Louis, MO)
in 0.1 M Tris-HCl, pH 7.5], and the lysate transferred into
fresh tubes. SDS-polyacrylamide gels (pre-cast 8-16%;
Novex, San Diego, CA) were loaded with 20 µl of supernatant
or 12.5 µl of cell lysate. A protein standard was also
loaded (5 µl, broad size range standard; BioRad
Laboratories, Hercules, CA). Electrophoresis was carried
out and the proteins were transferred using a BioRad
Transfer Chamber (BioRad Laboratories, Hercules, CA) to
Immobilon P membranes (Millipore Corp., Bedford, MA) using
the transfer buffer recommended by the manufacturer
(Millipore), where the transfer was performed at 100 volts
for 90 minutes. The membranes were exposed to
HIV-1-positive human patient serum and immunostained using
o-phenylenediamine dihydrochloride (OPD; Sigma).
The results of the immunoblotting analysis showed that
cells containing the synthetic Gag expression cassette
produced the expected p55 protein at higher per-cell
concentrations than cells containing the native expression
cassette. The Gag p55 protein was seen in both cell lysates
and supernatants. The levels of production were
significantly higher in cell supernatants for cells
transfected with the synthetic Gag expression cassette of
the present invention. Experiments performed in support of
the present invention suggest that cells containing the
synthetic Gag-prot expression cassette produced the expected
Gag-prot protein at comparably higher per-cell
concentrations than cells containing the native expression
cassette.

In addition, supernatants from the transfected 293
cells were fractionated on sucrose gradients. Aliquots of
the supernatant were transferred to Polyclear™
ultracentrifuge tubes (Beckman Instruments, Columbia, MD),
under-laid with a solution of 20% (wt/wt) sucrose, and
subjected to 2 hours centrifugation at 28,000 rpm in a
Beckman SW28 rotor. The resulting pellet was suspended in
PBS and layered onto a 20-60% (wt/wt) sucrose gradient and
subjected to 2 hours centrifugation at 40,000 rpm in a
Beckman SW41ti rotor.

The gradient was then fractionated into approximately
10 x 1 ml aliquots (starting at the top, 20%-end, of the
gradient). Samples were taken from fractions 1-9 and were
electrophoresed on 8-16% SDS polyacrylamide gels. Fraction
number 4 (the peak fraction) corresponds to the expected
density of Gag protein VLPs. The supernatants from
293/synthetic Gag cells gave much stronger p55 bands than
supernatants from 293/native Gag cells, and, as expected,
the highest concentration of p55 in either supernatant was
found in fraction 4.

These results demonstrate that the synthetic Gag
expression cassette provides superior production of both p55
protein and VLPs, relative to the native Gag coding
sequences.

B. Env Coding Sequences

Human 293 cells were transfected as described in
Example 2 with pCMVKm2-based, pCMVlink-based, pCMVII-based
or pESN2-based vectors containing native or synthetic Env
expression cassettes. Cells were cultivated for 48 or 60
hours post-transfection. Cell lysates and supernatants were
prepared as described (Example 2). Briefly, the cells were
washed once with phosphate-buffered saline, lysed with
detergent [1% NP40 (Sigma Chemical Co., St. Louis, MO) in
0.1 M Tris-HCl, pH 7.5], and the lysate transferred into
fresh tubes. SDS-polyacrylamide gels (pre-cast 8-16%;
Novex, San Diego, CA) were loaded with 20 µl of supernatant
or 12.5 µl of cell lysate. A protein molecular weight
standard and an HIV SF2 gp120 positive control protein (5
µl, broad size range standard; BioRad Laboratories,
Hercules, CA) were also loaded. Electrophoresis was carried
out and the proteins were transferred using a BioRad
Transfer Chamber (BioRad Laboratories, Hercules, CA) to
Immobilon P membranes (Millipore Corp., Bedford, MA) using
the transfer buffer recommended by the manufacturer
(Millipore), where the transfer was performed at 100 volts
for 90 minutes. The membranes were then reacted against
polyclonal goat anti-gp120SF2/env2-3 anti-sera, followed by
incubation with swine anti-goat IgG-peroxidase (POD) (Sigma,
St. Louis, MO). Bands indicative of binding were visualized
by adding DAB with hydrogen peroxide, which deposits a brown
precipitate on the membranes.

The results of the immunoblotting analysis showed that
cells containing the synthetic Env expression cassette
produced the expected Env gp proteins of the predicted
molecular weights, as determined by mobilities in
SDS-polyacrylamide gels, at higher per-cell concentrations
than cells containing the native expression cassette. The Env
proteins were seen in both cell lysates and supernatants.
The levels of production were significantly higher in cell
supernatants for cells transfected with the synthetic Env
expression cassette of the present invention.

C. Tat Coding Sequences

Human 293 cells are transfected as described in Example
2 with various vectors containing native or synthetic Tat
expression cassettes. Cells are cultivated and isolated
proteins analyzed as described above. Immunoblotting
analysis shows that cells containing the synthetic Tat
expression cassette produce the expected Tat proteins of
the predicted molecular weights, as determined by mobilities
in SDS-polyacrylamide gels, at higher per-cell concentrations
than cells containing the native expression cassette.

Example 4
Purification of Env polypeptides
A. Purification of Oligomeric gp140

Purification of oligomeric gp140 (o-gp140 US4) was
conducted essentially as shown in Figure 60. For the
experiments described herein, o-gp140 refers to oligomeric
gp140 in either native or modified (e.g., optimized
expression sequences, deleted, mutated, truncated, etc.)
form. Briefly, concentrated (30-50X) supernatants obtained
from CHO cell cultures were loaded onto an anion exchange
(DEAE) column which removed DNA and other serum proteins.
The eluted material was loaded onto a ceramic hydroxyapatite
column (CHAP) which bound serum proteins but not HIV Env
proteins. The flow-through from the DEAE and CHAP columns
was loaded onto a Protein A column as a precautionary step
to remove any remaining serum immunoglobulins. The Env
proteins in the flow-through were then captured using the
lectin Galanthus nivalis agglutinin (GNA, Vector Labs,
Burlingame, CA). GNA has high affinity for mannose-rich
carbohydrates such as those of Env. The Env proteins were
then eluted with GNA substrate. To remove other highly
glycosylated proteins, a cation exchange column (SP) was
used to purify gp140/gp120. In a final step, which separates
gp120 from o-gp140, a gel filtration column was used to
separate oligomers from monomers. Sizing and chromatography
analysis of the final product revealed that this strategy
led to the successful isolation of oligomeric gp140.

B. Purification of gp120

Purification of gp120 was conducted essentially as
previously described for other Env proteins. Briefly,
concentrated supernatants obtained from CHO cell cultures
were loaded onto an anion exchange (DEAE) column which
removed DNA and other serum proteins. The eluted material
was loaded onto a ceramic hydroxyapatite column (CHAP) which
bound serum proteins but not HIV Env proteins. The
flow-through from the CHAP column was loaded onto a cation
exchange column (SP), where the flow-through was discarded
and the bound fraction eluted with salt. The eluted
fraction(s) were loaded onto a Superose 12/Superdex 200
Tandem column (Pharmacia-Upjohn, Uppsala, Sweden) from which
purified gp120 was obtained. Sizing and chromatography
analysis of the final product revealed that this strategy
successfully purified gp120 proteins.

Example 5

Analysis of Purified Env Polypeptides
A. Analysis of o-gp140

It is well documented that HIV Env protein binds to CD4
only in its correct conformation. Accordingly, the ability
of o-gp140 US4 polypeptides, produced and purified as
described above, to bind CD4 was tested. o-gp140 US4 was
incubated for 15 minutes with FITC-labeled CD4 at room
temperature and loaded onto a Biosil 250 (BioRad) size
exclusion column using Waters HPLC. CD4-FITC had the longest
retention time (2.67 minutes), followed by CD4-FITC-gp120
(2.167 min). The shortest retention time (1.9 min) was
observed for CD4-FITC-o-gp140 US4, indicating that, as
expected, o-gp140 US4 binds to CD4, forming a large complex
which reduces retention time on the column. Thus, the
o-gp140 US4 produced and purified as described above is of
the correct size and conformation.
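
The size-exclusion reasoning above can be illustrated with a short sketch. It uses only the three retention times reported in the text; the function name and the simple "earlier elution means larger apparent size" rule are illustrative assumptions, not part of the described method.

```python
# Retention times (minutes) as reported in the text above.
retention_min = {
    "CD4-FITC": 2.67,
    "CD4-FITC-gp120": 2.167,
    "CD4-FITC-o-gp140 US4": 1.9,
}


def rank_by_apparent_size(times):
    """Order species from largest to smallest apparent size.

    On a size-exclusion column larger species elute earlier, so
    sorting by ascending retention time gives descending apparent
    size. This is a qualitative ranking only (hypothetical helper).
    """
    return sorted(times, key=times.get)


ranking = rank_by_apparent_size(retention_min)
for species in ranking:
    print(f"{species}: {retention_min[species]} min")
```

The ranking places the CD4-FITC-o-gp140 complex first, consistent with the interpretation that the oligomer forms the largest complex with CD4.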

In addition, the US4 o-gp140, purified as described
above, was also tested for its ability to bind to a variety
of monoclonal antibodies with known epitope specificities
for the CD4 binding site, the CD4 inducible site, the V3
loop and an oligomer-specific gp41 epitope. o-gp140 bound
strongly to these antibodies, indicating that the purified
protein retains its structural integrity.

B. Analysis of gp120

As described above, CD4-FITC binds gp120, as
demonstrated by the decreased retention time on the HPLC
column. Thus, US4 gp120 purified by the above method
retains its conformational integrity. In addition, the
properties of purified gp120 can be tested by examining its
integrity and identity on western blots, as well as by
examining protein concentration, pH, conductivity, endotoxin
levels, bioburden and the like. US4 gp120, purified as
described above, was also tested for its ability to bind to
a variety of monoclonal antibodies with known epitope
specificities for the CD4 binding site, the CD4 inducible
site, the V3 loop and an oligomer-specific gp41 epitope. The
pattern of mAb binding to gp120 indicated that the purified
protein retained its structural integrity; for example, the
purified gp120 did not bind the mAb having the
oligomer-specific gp41 epitope (as expected).

Example 6

Electron Microscopic Evaluation of VLP Production

The cells for electron microscopy were plated at a
density of 50-70% confluence, one day before transfection.
The cells were transfected with 10 µg of DNA using
transfection reagent LT1 (Panvera) and incubated for 5 hours
in serum-reduced medium (see Example 2). The medium was
then replaced with normal medium (see Example 2) and the
cells were incubated for 14 hours (COS-7) or 40 hours (CHO).
After incubation the cells were washed twice with PBS and
fixed with 2% glutaraldehyde. Electron microscopy was
performed by Prof. T.S. Benedict Yen (Veterans Affairs
Medical Center, San Francisco, CA).

Electron microscopy was carried out using a
transmission electron microscope (Zeiss 10c). The cells
were pre-stained with osmium and stained with uranium
acetate and lead citrate. The magnification was 100,000X.

Figures 3A and 3B show micrographs of CHO cells
transfected with pCMVKM2 carrying the synthetic Gag
expression cassette (SEQ ID NO:5) or carrying the Gag-prot
expression cassette (SEQ ID NO:79). In the figure, free and
budding immature virus-like particles (VLPs) of the expected
size (100 nm) are seen for the Gag expression cassette
(Figure 3A) and both immature and mature VLPs are seen for
the Gag-prot expression cassette (Figure 3B). COS-7 cells
transfected with the same vector have the same expression
pattern. VLPs can also be found intracellularly in CHO and
COS-7 cells.

Native and synthetic Gag expression cassettes were
compared for their associated levels of VLP production when
used to transfect human 293 cells. The comparison was
performed by density gradient ultracentrifugation of cell
supernatants and Western-blot analysis of the gradient
fractions. There was a clear improvement in production of
VLPs when using the synthetic Gag construct.

Example 7

Expression of Virus-like Particles in the Baculovirus System
A. Expression of Native HIV p55 Gag

To construct the native HIV p55 Gag baculovirus shuttle
vector, the prototype SF2 HIV p55 plasmid, pTM1-Gag (Selby
M.J., et al., J Virol. 71(10):7827-7831, 1997), was digested
with restriction endonucleases NcoI and BamHI to extract a
1.5 kb fragment that was subsequently subcloned into pAcC4
(Bio/Technology 6:41-55, 1988), a derivative of pAc436.
Generation of the recombinant baculovirus was achieved by
co-transfecting 2 µg of the HIV p55 Gag pAcC4 shuttle vector
with 0.5 µg of linearized Autographa californica
baculovirus (AcNPV) wild-type viral DNA into Spodoptera
frugiperda (Sf9) cells (Kitts, P.A., Ayres M.D., and Possee
R.D., Nucleic Acids Res. 18:5667-5672, 1990). The isolation
of recombinant virus expressing HIV p55 Gag was performed
according to standard techniques (O'Reilly, D.R., L.K.
Miller, and V.A. Luckow, Baculovirus Expression Vector: A
Laboratory Manual, W.H. Freeman and Company, New York,
1992).

Expression of the HIV p55 Gag was achieved using a 500
ml suspension culture of Sf9 cells grown in serum-free
medium (Maiorella, B., D. Inlow, A. Shauger, and D. Harano,
Bio/Technology 6:1506-1510, 1988) that had been infected
with the HIV p55 Gag recombinant baculovirus at a
multiplicity of infection (MOI) of 10. Forty-eight hours
post-infection, the supernatant was separated by
centrifugation and filtered through a 0.2 µm filter.
Aliquots of the supernatant were then transferred to
Polyclear™ (Beckman Instruments, Palo Alto, CA)
ultracentrifuge tubes, underlaid with 20% (wt/wt) sucrose,
and subjected to 2 hours centrifugation at 24,000 rpm using
a Beckman SW28 rotor.
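
Because the protocol states rotor speeds in rpm, it can help to convert them to relative centrifugal force (RCF) when a different rotor is used. The following is a minimal sketch of the standard conversion, RCF = 1.118e-5 x r x rpm^2 with r in centimetres. The rotor radii below are nominal maximum radii assumed for illustration; they are not given in the text and should be checked against the rotor manual.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and radius.

    Standard relation: RCF = 1.118e-5 * r(cm) * rpm**2.
    """
    return 1.118e-5 * radius_cm * rpm ** 2


# ASSUMED nominal maximum radii (cm); verify against the rotor manual.
ASSUMED_RMAX_CM = {
    "SW28": 16.1,
    "SW41Ti": 15.31,
}

# Speeds are those stated in the protocol.
pelleting_rcf = rcf_from_rpm(24_000, ASSUMED_RMAX_CM["SW28"])
banding_rcf = rcf_from_rpm(40_000, ASSUMED_RMAX_CM["SW41Ti"])

print(f"20% sucrose pelleting: ~{pelleting_rcf:,.0f} x g at r_max")
print(f"gradient banding:      ~{banding_rcf:,.0f} x g at r_max")
```

The values apply at the assumed maximum radius only; the force at the top of the tube is lower.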

The resulting pellet was suspended in Tris buffer (20
mM Tris-HCl, pH 7.5, 250 mM NaCl, and 2.5 mM
ethylenediaminetetraacetic acid [EDTA]), layered onto a
20-60% (wt/wt) sucrose gradient, and subjected to 2 hours
centrifugation at 40,000 rpm using a Beckman SW41Ti rotor.
The gradient was then fractionated starting at the top (20%
sucrose) of the gradient into approximately twelve 0.75 ml
aliquots. A sample of each fraction was electrophoresed on
8-16% SDS polyacrylamide gels and the resulting bands were
visualized after Coomassie staining (Figure 4). Additional
aliquots were subjected to refractive index analysis.

The results shown in Figure 4 indicated that the p55
Gag virus-like particles banded at a sucrose density in the
range of 1.15-1.19 g/ml, with the peak at approximately
1.17 g/ml. The peak fractions were pooled and concentrated
by a second 20% sucrose pelleting. The resulting pellet was
suspended in 1 ml of Tris buffer (described above). The
total protein yield as estimated by bicinchoninic acid
(BCA) assay (Pierce Chemical, Rockford, IL) was 1.6 mg.

B. Expression of Synthetic HIV p55 Gag

A baculovirus shuttle vector containing the synthetic
p55 Gag sequence was constructed as follows. The synthetic
HIV p55 expression cassette (Example 1) was digested with
restriction enzyme SalI followed by incubation with T4 DNA
polymerase. The resulting fragment was isolated (PCR
Clean-Up™, Promega, Madison, WI) and then digested with BamHI
endonuclease. The shuttle vector pAcC13 (Munemitsu S., et
al., Mol Cell Biol. 10(11):5977-5982, 1990) was linearized
by digestion with Ecol, followed by incubation with T4 DNA
polymerase, and then isolated (PCR Clean-Up™). The
linearized vector was digested with BamHI, treated with
alkaline phosphatase, and isolated by size fractionation in
an agarose gel. The isolated 1.5 kb fragment was ligated
with the prepared pAcC13 vector. The resulting clone was
designated pAcC13-Modif.p55Gag.

The expression conditions for the synthetic HIV p55
VLPs differed from those of the native p55 Gag as follows:
a culture volume of 1 liter was used instead of 500 ml;
Trichoplusia ni (Tn5) (Wickham, T.J., and Nemerow, G.R.,
BioTechnology Progress, 9:25-30, 1993) insect cells were
used instead of Sf9 insect cells; and an MOI of 3 was used
instead of an MOI of 10. Experiments performed in support
of the present invention showed that there was no
appreciable difference in expression level between the Sf9
and Tn5 insect cells with the native p55 clone. In terms of
MOI, experience with the native p55 clone suggested that an
MOI of 10 resulted in higher expression (approximately
2-fold) of VLPs than a lower MOI.

The sucrose pelleting and banding methods used for the
synthetic p55 VLPs were similar to those employed for the
native p55 VLPs (described above), with the following
exceptions: pelleted VLPs were suspended in 4 ml of
phosphate buffered saline (PBS) instead of 1.0 ml of the
Tris buffer; and four 20-60% sucrose gradients were used
instead of a single gradient. Also, due to the high
concentration of banded VLPs, further concentration by
pelleting was not required. The peak fractions from all 4
gradients were simply dialyzed against PBS. The approximate
density of the banded VLPs ranged from 1.23-1.28 g/ml. The
total protein yield as estimated by BCA was 46 mg. Results
from the sucrose gradient banding of the synthetic p55 are
shown in Figure 5.

A comparison of the total amount of purified HIV p55
Gag from several preparations obtained from the two
baculovirus expression cassettes is summarized in
Figure 6. The average yield from the native p55 was 3.16
mg/liter of culture (n=5, standard deviation (sd) ±1.07,
range = 1.8-4.8 mg/L), whereas the average yield from the
synthetic p55 was more than ten-fold higher at 44.5 mg/liter
of culture (n=2, sd=±6.4).
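
The yield comparison can be checked with a few lines of arithmetic. This sketch uses only figures stated in the text (per-preparation totals and culture volumes, and the reported mean yields); the function name is illustrative.

```python
def yield_mg_per_liter(total_protein_mg, culture_volume_ml):
    """Normalize a total protein yield to mg per liter of culture."""
    return total_protein_mg / (culture_volume_ml / 1000.0)


# Single preparations described in sections A and B above.
native_prep = yield_mg_per_liter(1.6, 500)       # 500 ml Sf9 culture
synthetic_prep = yield_mg_per_liter(46.0, 1000)  # 1 liter Tn5 culture

# Mean yields reported for multiple preparations (Figure 6 summary).
native_mean = 3.16     # mg/liter, n=5
synthetic_mean = 44.5  # mg/liter, n=2

fold_increase = synthetic_mean / native_mean
print(f"native prep: {native_prep:.1f} mg/L, "
      f"synthetic prep: {synthetic_prep:.1f} mg/L")
print(f"fold increase in mean yield: {fold_increase:.1f}x")
```

The ratio of the reported means is about 14, in line with the "more than ten-fold" statement.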

In addition to a higher total protein yield, the final
product from the synthetic p55-expressed Gag consistently
contained lower amounts of contaminating baculovirus
proteins than the final product from the native
p55-expressed Gag. This difference can be seen in the two
Coomassie-stained gels (Figures 4 and 5).

C. Expression of Native and Synthetic Gag-Core

Expression of the HIV p55 Gag/HCV Core 173 (SEQ ID
NO:8) was achieved using a 2.5 liter suspension culture of
Sf9 cells grown in serum-free medium (Maiorella, B., D.
Inlow, A. Shauger, and D. Harano, 1988, Bio/Technology
6:1506-1510). The cells were infected with an HIV p55
Gag/HCV Core 173 recombinant baculovirus. Forty-eight hours
post-infection, the supernatant was separated from the cells
by centrifugation and filtered through a 0.2 µm filter.
Aliquots of the supernatant were then transferred to
Polyclear™ (Beckman Instruments, Palo Alto, CA)
ultracentrifuge tubes containing 30% (wt/wt) sucrose, and
subjected to 2 hours of centrifugation at 24,000 rpm in a
Beckman SW28 rotor and ultracentrifuge.

The resulting pellet was suspended in Tris buffer (50
mM Tris-HCl, pH 7.5, 500 mM NaCl), layered onto a 30-60%
(wt/wt) sucrose gradient, and subjected to 2 hours
centrifugation at 40,000 rpm in a Beckman SW41Ti rotor and
ultracentrifuge. The gradient was then fractionated
starting at the top (30%) of the gradient into approximately
11 x 1.0 ml aliquots. A sample of each fraction was
electrophoresed on 8-16% SDS polyacrylamide gels and the
resulting bands were visualized after Coomassie staining.
A subset of aliquots was also subjected to Western
blot analysis using monoclonal antibody 76C.5EG (Steimer,
K.S., et al., Virology 150:283-290, 1986), which is specific
for HIV p24 (a subunit of HIV p55). The peak fractions from
the sucrose gradient were pooled and concentrated by a
second 20% sucrose pelleting. The resulting pellet was
suspended in 1 ml of Tris buffer and the total
protein yield as estimated by BCA (Pierce Chemical,
Rockford, IL) was ~1.0 mg.

The results from the SDS-PAGE are shown in Figure 8 and
the anti-p24 Western blot results are shown in Figure 9.
Taken together, these results indicate that the HIV p55
Gag/HCV Core 173 chimeric VLPs banded at a sucrose density
similar to that of the HIV p55 Gag VLPs, and that the visible
protein band that migrated at a molecular weight of
approximately 72 kD was reactive with the HIV p24-specific
monoclonal antibody. An additional immunoreactive band at
approximately 55 kD also appeared to be reactive with
the anti-p24 antibody and may be a degradation product.

Although aliquots from the above preparation were not
tested for reactivity with an HCV Core-specific antibody (an
anti-CD22 rabbit serum), results from a similar preparation
are shown in Figure 10 and indicate that the main HCV
Core-specific reactivity migrates at an approximate molecular
weight of 72 kD, which is in accordance with the
predicted molecular weight of the chimeric protein.

The expression conditions for the synthetic HIV p55
Gag/HCV Core 173 (SEQ ID NO:8) VLPs differed from those of
the native p55 Gag as follows: a culture volume of 1
liter was used instead of 2.5 liters; Trichoplusia ni
(Tn5) (Wickham, T.J., and Nemerow, G.R., 1993, BioTechnology
Progress, 9:25-30) insect cells were used instead of Sf9
insect cells; and an MOI of 3 was used instead of an MOI of
10. The sucrose pelleting and banding methods used for the
synthetic HIV p55 Gag/HCV Core 173 VLPs were similar to
those employed for the native HIV p55 Gag/HCV Core 173 VLPs.
However, differences included: pelleted VLPs were suspended
in 1 ml of phosphate buffered saline (PBS) instead of 1.0 ml
of the Tris buffer, and a single 20-60% sucrose gradient
was used. A comparison of the total amount of purified HIV
p55 Gag/HCV Core 173 from multiple preparations obtained
from the two baculovirus expression cassettes showed that
there was an increase in expression using the synthetic HIV
p55 Gag/HCV Core 173 cassette.

D. Alternative method for the enrichment of HIV p55 Gag
VLPs

In addition to purification from the media, p55 (Gag
protein) expressed in baculovirus (e.g., using a synthetic
expression cassette of the present invention) can also be
purified as virus-like particles from the infected insect
cells. For example, forty-eight hours post-infection, the
media and cell pellet are separated by centrifugation and
the cell pellet is stored at -70°C until future use. At the
time of processing, the cell pellet is suspended in 5
volumes of hypotonic lysis buffer (20 mM Tris-HCl, pH 8.2, 1
mM EGTA, 1 mM MgCl2, and Complete Protease Inhibitor®
[Boehringer Mannheim Corp., Indianapolis, IN]). If needed,
the cells are then dounced 8-10 times to complete cell
lysis.

The lysate is then centrifuged at approximately
1000-1500 x g for 20 minutes. The supernatant is decanted into
UltraClear™ tubes, underlaid with 20% sucrose (w/w) and
centrifuged at 24,000 rpm in SW28 buckets for 2 hours. The
resulting pellet is suspended in Tris buffer (20 mM
Tris-HCl, pH 7.5, 250 mM NaCl, and 2.5 mM
ethylenediaminetetraacetic acid [EDTA]) with 0.1% IGEPAL
detergent (Sigma Chemical, St. Louis, MO) and 250 units/ml
of benzonase (American International Chemical, Inc., Natick,
MA) and incubated at 4°C for at least 30 minutes. The
suspension is subsequently layered onto a 20-60% sucrose
gradient and spun at 40,000 rpm using an SW41Ti rotor for
20-24 hours.

After ultracentrifugation, the sucrose gradient is
fractionated and aliquots are run on SDS-PAGE to identify
peak fractions. The peak fractions are dialyzed against PBS
and measured for protein content. Negatively stained electron
micrographs typically show non-enveloped VLPs somewhat
smaller in diameter (80-120 nm) than the budded
VLPs. HIV Gag VLPs prepared in this manner are also
capable of generating Gag-specific CTL responses in mice.

Example 8

In Vivo Immunogenicity of Synthetic Gag Expression Cassettes
A. Immunization

To evaluate the possibly improved immunogenicity of the
synthetic Gag expression cassettes, a mouse study was
performed. The plasmid DNA, pCMVKM2 carrying the synthetic
Gag expression cassette, was diluted to the following final
amounts in a total injection volume of 100 µl: 20 µg, 2 µg,
0.2 µg, and 0.02 µg. To overcome possible
negative dilution effects of the diluted DNA, the total DNA
in each sample was brought up to 20 µg using
the vector (pCMVKM2) alone. As a control, plasmid DNA of
the native Gag expression cassette was handled in the same
manner. Twelve groups of four Balb/c mice (Charles River,
Boston, MA) were intramuscularly immunized (50 µl per leg,
intramuscular injection into the tibialis anterior)
according to the schedule in Table 7.

Table 7

Group   Gag Expression   Concentration of Gag   Immunized at
        Cassette         plasmid DNA (µg)       time (weeks):
1       Synthetic        20                     0¹, 4
2       Synthetic        2                      0, 4
3       Synthetic        0.2                    0, 4
4       Synthetic        0.02                   0, 4
5       Synthetic        20                     0
6       Synthetic        2                      0
7       Synthetic        0.2                    0
8       Synthetic        0.02                   0
9       Native           20                     0
10      Native           2                      0
11      Native           0.2                    0
12      Native           0.02                   0

¹ = initial immunization at "week 0"
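
The carrier-DNA top-up used for the dose series (each sample brought to 20 µg total DNA with empty pCMVKM2 vector, in 100 µl split over two legs) can be tabulated as follows. This is a minimal sketch; the function and variable names are illustrative and not taken from the source.

```python
TOTAL_DNA_UG = 20.0       # total DNA per injection, from the text
TOTAL_VOLUME_UL = 100.0   # total injection volume, from the text
LEGS = 2                  # 50 µl per leg

GAG_DOSES_UG = (20, 2, 0.2, 0.02)


def carrier_vector_ug(gag_plasmid_ug, total_ug=TOTAL_DNA_UG):
    """Mass of empty vector needed to reach the fixed total DNA."""
    if not 0 <= gag_plasmid_ug <= total_ug:
        raise ValueError("Gag plasmid dose must be within 0..total")
    return total_ug - gag_plasmid_ug


# Rounded to 2 decimals to suppress floating-point noise.
plan = {dose: round(carrier_vector_ug(dose), 2) for dose in GAG_DOSES_UG}
volume_per_leg_ul = TOTAL_VOLUME_UL / LEGS

for dose, carrier in plan.items():
    print(f"Gag plasmid {dose} µg + vector {carrier} µg "
          f"in {volume_per_leg_ul:.0f} µl per leg")
```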



Groups 1-4 were bled at week 0 (before immunization),
week 4, week 6, week 8, and week 12. Groups 5-12 were bled
at week 0 (before immunization) and at week 4.

B. Humoral Immune Response

The humoral immune response was checked with an
anti-HIV Gag antibody ELISA (enzyme-linked immunosorbent
assay) of the mouse sera at 0 and 4 weeks post immunization
(groups 5-12) and, in addition, at 6 and 8 weeks post
immunization, i.e., 2 and 4 weeks post second immunization
(groups 1-4).

The antibody titers of the sera were determined by
anti-Gag antibody ELISA. Briefly, sera from immunized mice
were screened for antibodies directed against the HIV p55
Gag protein. ELISA microtiter plates were coated with 0.2 µg
of HIV-1 SF2 p24-Gag protein per well overnight and washed
four times; subsequently, blocking was done with PBS-0.2%
Tween (Sigma) for 2 hours. After removal of the blocking
solution, 100 µl of diluted mouse serum was added. Sera
were tested at 1/25 dilutions and by serial 3-fold
dilutions thereafter. Microtiter plates were washed four
times and incubated with a secondary, peroxidase-coupled
anti-mouse IgG antibody (Pierce, Rockford, IL). ELISA
plates were washed and 100 µl of 3,3',5,5'-tetramethyl
benzidine (TMB; Pierce) was added per well. The optical
density of each well was measured after 15 minutes. The
titers reported are the reciprocal of the dilution of serum
that gave a half-maximum optical density (O.D.). The ELISA
results are presented in Table 8.
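
The titer definition above (reciprocal dilution giving half-maximum O.D.) can be sketched as follows. The source does not state how the half-maximum point was located between measured dilutions; log-linear interpolation between the two bracketing dilutions, and taking the highest observed O.D. as the maximum, are assumptions made here for illustration. The example O.D. values are hypothetical.

```python
import math


def half_max_titer(reciprocal_dilutions, ods):
    """Reciprocal dilution at which O.D. falls to half its maximum.

    Assumptions (not from the source): the maximum is the highest
    observed O.D., and the crossing point is found by linear
    interpolation on log10(dilution). Returns None when the curve
    never crosses the half-maximum within the tested range.
    """
    pairs = sorted(zip(reciprocal_dilutions, ods))
    half = max(od for _, od in pairs) / 2.0
    for (d_lo, od_lo), (d_hi, od_hi) in zip(pairs, pairs[1:]):
        if od_lo >= half >= od_hi and od_lo != od_hi:
            frac = (od_lo - half) / (od_lo - od_hi)
            log_titer = math.log10(d_lo) + frac * (
                math.log10(d_hi) - math.log10(d_lo)
            )
            return 10 ** log_titer
    return None


# Hypothetical readings for a 1/25 start with 3-fold serial dilutions.
dilutions = [25, 75, 225, 675, 2025]
example_ods = [2.0, 1.6, 1.0, 0.4, 0.1]
titer = half_max_titer(dilutions, example_ods)
print(f"half-maximum titer: {titer:.0f}")
```

A serum whose signal never reaches the half-maximum within the tested range returns None here, which loosely corresponds to the "< 20" entries reported as below the limit of the assay.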



Table 8

Group   Inoculum   Expression   Sera      Sera     Sera
        (µg)       cassette     Week 4³   Week 6   Week 8
1       20         S¹-gag       98        455      551
2       2          S-gag        59        1408     227
3       0.2        S-gag        29        186      61
4       0.02       S-gag        < 20      < 20     < 20
5       20         S-gag        67        n.a.⁴    n.a.
6       2          S-gag        63        n.a.     n.a.
7       0.2        S-gag        57        n.a.     n.a.
8       0.02       S-gag        < 20      n.a.     n.a.
9       20         N²-gag       43        n.a.     n.a.
10      2          N-gag        < 20      n.a.     n.a.
11      0.2        N-gag        < 20      n.a.     n.a.
12      0.02       N-gag        < 20      n.a.     n.a.

¹ = synthetic gag expression cassette (SEQ ID NO:4)
² = native gag expression cassette (SEQ ID NO:1)
³ = geometric mean antibody titer
⁴ = not applicable

The results of the mouse immunizations with plasmid
DNAs show that the synthetic expression cassettes provide a
clear improvement of immunogenicity relative to the native
expression cassettes. Also, the second boost immunization
induced a secondary immune response after two weeks (groups
1-3).

C. Cellular Immune Response

The frequency of specific cytotoxic T-lymphocytes (CTL)
was evaluated by a standard chromium release assay of
peptide-pulsed Balb/c mouse CD4 cells. Gag-expressing
vaccinia virus-infected CD-8 cells were used as a positive
control (vvGag). Briefly, spleen cells (effector cells, E)
obtained from the BALB/c mice immunized as described
above (Table 8) were cultured, restimulated, and assayed for
CTL activity against Gag peptide-pulsed target cells as
described (Doe, B., and Walker, C.M., AIDS 10(7):793-794,
1996). The HIV-1 SF2 Gag peptide used was p7g (SEQ ID NO:10).
Cytotoxic activity was measured in a standard 51Cr release
assay. Target (T) cells were cultured with effector (E)
cells at various E:T ratios for 4 hours and the average cpm
from duplicate wells was used to calculate percent specific
51Cr release. The results are presented in Table 9.
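
The calculation of percent specific 51Cr release from duplicate-well cpm can be sketched as below. The source does not spell out the formula; the conventional form, 100 x (experimental - spontaneous) / (maximum - spontaneous), is assumed here, and the spontaneous- and maximum-release control wells and all cpm values are hypothetical.

```python
from statistics import mean


def percent_specific_release(experimental_cpm, spontaneous_cpm,
                             maximum_cpm):
    """Percent specific 51Cr release from replicate-well counts.

    Assumed conventional formula (not stated in the source):
        100 * (experimental - spontaneous) / (maximum - spontaneous)
    Each argument is a list of cpm values from replicate wells.
    """
    exp = mean(experimental_cpm)
    spont = mean(spontaneous_cpm)
    max_release = mean(maximum_cpm)
    if max_release <= spont:
        raise ValueError("maximum release must exceed spontaneous")
    return 100.0 * (exp - spont) / (max_release - spont)


# Hypothetical duplicate wells for one E:T ratio.
lysis = percent_specific_release(
    experimental_cpm=[1450, 1550],
    spontaneous_cpm=[500, 500],
    maximum_cpm=[2500, 2500],
)
print(f"percent specific lysis: {lysis:.0f}%")
```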

Cytotoxic T-cell (CTL) activity was measured in
splenocytes recovered from the mice immunized with HIV Gag
DNA (compare Effector column, Table 9, to immunization
schedule, Table 8). Effector cells from the Gag
DNA-immunized animals exhibited specific lysis of Gag p7g
peptide-pulsed SV-BALB (MHC-matched) target cells,
indicative of a CTL response. Target cells that were
peptide-pulsed and derived from an MHC-unmatched mouse
strain (MC57) were not lysed (Table 9; MC/p7g).

Table 9. Cytotoxic T-lymphocyte (CTL) responses in mice
immunized with HIV-1 gag DNA

                             Percent specific lysis of
                             target cells*
Immunization        E:T      SVBALB   SVBALB   RMA
                             none     p7g      p7g
20 µg DNA           100:1    2        49       <1
gagmod              30:1     3        30       <1
                    10:1     <1       14       <1
2 µg DNA            100:1    2        37       <1
gagmod              30:1     2        21       <1
                    10:1     <1       13       <1
0.2 µg DNA          100:1    2        32       <1
gagmod              30:1     3        25       <1
                    10:1     1        14       <1
0.02 µg DNA         100:1    1        17       <1
gagmod              30:1     1        16       <1
                    10:1     1        8        <1
20 µg DNA           100:1    2        49       <1
gag native          30:1     2        24       <1
                    10:1     1        12       <1
2 µg DNA            100:1    <1       18       <1
gag native          30:1     1        14       <1
                    10:1     1        7        <1
0.2 µg DNA          100:1    3        30       <1
gag native          30:1     3        17       <1
                    10:1     2        7        <1
0.02 µg DNA         100:1    4        2        <1
gag native          30:1     1        2        <1
                    10:1     1        2        <1

*representative results of two animals per DNA-dose; positive
CTL responses are indicated by boxed data

The results of the CTL assays show increased potency of
synthetic Gag expression cassettes for induction of cytotoxic
T-lymphocyte (CTL) responses by DNA immunization.

Example 9

In Vivo Immunization with Env Polypeptides
A. Immunogenicity Study of US4 o-gp140 in Ras-3c Adjuvant
System

Studies have been conducted using rabbits immunized with
US4 o-gp140 purified as described above. Studies are also
underway in animals to determine immunogenicity of US4 gp120,
SF162 o-gp140 and SF162 gp120.

Two rabbits (#1 and #2) were immunized intramuscularly at
0, 4, 12 and 24 weeks with 50 µg of US4 o-gp140 in the Ribi™
adjuvant system (RAS-3c) (Ribi Immunochem, Hamilton, MT)
containing 2% Squalene, 0.2% Tween 80, and one or more
bacterial cell wall components from the group consisting of
monophosphoryl lipid A (MPL, Ribi Immunochem, Hamilton, MT). In
each experiment described herein, o-gp140 can be native,
mutated and/or modified. Antibody responses directed against
the US4 o-gp140 protein were measured by ELISA. Results are
shown in Table 10.

Table 10

Rabbit/sample                   Approximate o-gp140 ELISA
                                titer
pre-immunization                0
#1: post1 (0 week immuniz)      400
#1: post2 (4 week immuniz)      15,000
#1: post3 (12 week immuniz)     50,000
#1: post4 (24 week immuniz)     100,000
#2: post1 (0 week immuniz)      600
#2: post2 (4 week immuniz)      12,000
#2: post3 (12 week immuniz)     25,000
#2: post4 (24 week immuniz)     55,000

The avidities of antibodies directed against the US4
o-gp140 protein were measured in a similar ELISA format
employing successive washes with increasing concentrations of
ammonium isothiocyanate. Results are shown in Table 11.

Table 11

Time of sample                  Approx. antibody avidity
                                (NH4SCN conc. in M)
pre-immunization                0.02
post1 (0 week immuniz)          1.8
post2 (4 week immuniz)          3.5
post3 (12 week immuniz)         5.5
post4 (24 week immuniz)         5.1

These results show that US4 o-gp140 is highly immunogenic
and able to induce substantial antibody responses after only
one or two immunizations.

B. Immunogenicity of US4 o-gp140 in MF59-based Adjuvants

Groups of 4 rabbits were immunized intramuscularly at 0,
4, 12 and 24 weeks with various doses of US4 o-gp140 protein
in three different MF59-based adjuvants (MF59 is described in
International Publication No. WO 90/14837 and typically
contains 5% Squalene, 0.5% Tween 80, and 0.5% Span 85).
Antibody titers were measured post-third by ELISA using SF2
gp120 to coat the plates. QHC is a quill-based adjuvant
(Iscotek, Uppsala, Sweden). Results are shown in Table 12.

Table 12

Antigen dose (µg)   Adjuvant        Anti-gp120 SF2 Ab GMT*
12.5                MF59            7231
25                  MF59            8896
50                  MF59            12822
12.5                MF59/MPL        24146
25                  MF59/MPL        27199
50                  MF59/MPL        23059
50                  MF59/MPL/QHC    31759

*GMT = geometric mean titer
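
For reference, a geometric mean titer for a group of animals can be computed as the exponential of the mean of the log titers. The sketch below shows that calculation; the individual titers are hypothetical, since the per-animal values behind Table 12 are not given, and the handling of below-limit values (e.g., "< 20") is not addressed.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean of positive titers: exp(mean(ln(titer)))."""
    if not titers:
        raise ValueError("at least one titer is required")
    if any(t <= 0 for t in titers):
        raise ValueError("titers must be positive")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


# Hypothetical titers for a group of four animals.
example_titers = [100, 400, 1600, 6400]
gmt = geometric_mean_titer(example_titers)
print(f"GMT: {gmt:.0f}")
```

The geometric mean is preferred over the arithmetic mean for titers because serial-dilution data are spread multiplicatively, so a single high responder does not dominate the group value.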

Thus, adjuvanted o-gp140 generated antigen-specific
antibodies. Further, the antibodies were shown to increase
in avidity over time.

C. Neutralizing Antibodies

Neutralizing antibodies post-third immunization were
measured against HIV-1 SF2 in a T-cell line adapted virus
(TCLA) assay and against PBMC-grown HIV-1 variants SF2, SF162
and 119 using the CCR5+ CEMx174 LTR-GFP reporter cell line,
5.25 (provided by N. Landau, Salk Institute, San Diego, CA), as
target cells. Results are shown in Table 13.

Table 13

Neutralizing antibody responses in rabbits immunized with
o-gp140.modUS4 protein

Group                 Animal        SF2           SF2     SF162   119
                                    TCLA*         PBMC#   PBMC#   PBMC#
Experiment 1
o-gp140/Ras-3c        217           >640          100     49      17
50 µg                 218           >640          96      37      29
Experiment 2
o-gp140/MF59          [illegible]   [illegible]   71      [illegible]   26
50 µg                 [illegible]   [illegible]   [illegible]   [illegible]   4
                      [illegible]   [illegible]   [illegible]   [illegible]   [illegible]
                      [illegible]   [illegible]   [illegible]   [illegible]   [illegible]
o-gp140/MF59 + MPL    [illegible]   [illegible]   91      47      18
50 µg                 805           134           93      28      4
                      806           N.D.**        95      49      13
                      807           441           100     31      15
o-gp140/MF59          808           465           98      46      40
+ MPL + QHC
50 µg                 809           496           100     44      39
                      810           >640          101     27      4
                      811           92            92      24      37

*TCLA neutralizing antibody titers (50% inhibition).
**Not Determined
# % Inhibition at 1:10 dilution of sera with any detectable
non-specific inhibition in pre-bleeds subtracted.

The above studies in rabbits indicate that the US4
o-gp140 protein is highly immunogenic. When administered with
adjuvant, this protein was able to induce substantial antibody
responses after only one or two immunizations. Moreover, the
adjuvanted o-gp140 protein was able to generate
antigen-specific antibodies which increased in avidity after
successive immunizations, and substantial neutralizing
activity against T-cell line adapted HIV-1. Neutralizing
activity was also observed against PBMC-grown primary HIV
strains, including the difficult-to-neutralize CCR5
co-receptor (R5)-utilizing isolates, SF162 and 119.

Example 10

In Vivo Immunogenicity of Synthetic Env Expression Cassettes

A. General Immunization Methods

To evaluate the immunogenicity of the synthetic Env
expression cassettes, studies using guinea pigs, rabbits,
mice, rhesus macaques and baboons were performed. The studies
were structured as follows: DNA immunization alone (single or
multiple); DNA immunization followed by protein immunization
(boost); DNA immunization followed by Sindbis particle
immunization; immunization by Sindbis particles alone.

B. Humoral Immune Response

The humoral immune response was checked in serum
specimens from immunized animals with an anti-HIV Env antibody
ELISA (enzyme-linked immunosorbent assay) at various times
post-immunization. The antibody titers of the sera were
determined by anti-Env antibody ELISA as described above.
Briefly, sera from immunized animals were screened for
antibodies directed against the HIV gp120 or gp140 Env
protein. Wells of ELISA microtiter plates were coated
overnight with the selected Env protein and washed four times;
subsequently, blocking was done with PBS-0.2% Tween (Sigma)
for 2 hours. After removal of the blocking solution, 100 µl
of diluted mouse serum was added. Sera were tested at 1/25
dilutions and by serial 3-fold dilutions thereafter.
Microtiter plates were washed four times and incubated with a
secondary, peroxidase-coupled anti-mouse IgG antibody (Pierce,
Rockford, IL). ELISA plates were washed and 100 µl of
3,3',5,5'-tetramethyl benzidine (TMB; Pierce) was added per
well. The optical density of each well was measured after 15
minutes. Titers are typically reported as the reciprocal of
the dilution of serum that gave a half-maximum optical density
(O.D.).
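
The dilution scheme described above (a 1/25 starting dilution followed by serial 3-fold dilutions) can be generated programmatically when laying out plates. In this sketch the number of dilution steps is an assumption, since the source does not state how many dilutions were run.

```python
def serial_dilutions(start=25, factor=3, steps=8):
    """Reciprocal dilutions for a serial dilution series.

    start and factor follow the text (1/25, then 3-fold);
    steps is an ASSUMED value, not given in the source.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [start * factor ** i for i in range(steps)]


series = serial_dilutions()
print(", ".join(f"1/{d}" for d in series))
```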

Example 11

DNA-immunization of Baboons Using Synthetic Gag Expression
Cassettes

A. Baboons

Four baboons were immunized 3 times (weeks 0, 4 and 8)
bilaterally, intramuscularly into the quadriceps, using 1 mg
pCMVKM2.GagMod.SF2 plasmid DNA (Example 1). The animals were
bled two weeks after each immunization and a p24 antibody
ELISA was performed with isolated plasma. The ELISA was
performed essentially as described in Example 5 except the
second antibody-conjugate was an anti-human IgG, γ-chain
specific, peroxidase conjugate (Sigma Chemical Co., St. Louis,
MO 63178) used at a dilution of 1:500. Fifty µg/ml yeast
extract was added to the dilutions of plasma samples and
antibody conjugate to reduce non-specific background due to
preexisting yeast antibodies in the baboons. The antibody
titer results are presented in Table 14.



Table 14

Immunization   Weeks   Antigen           wpi a      Baboon No.   Ab-titer b
no.

1              0       gagmod DNA        0          219          < 10
                                         0          220          < 10
                                         0          221          < 10
                                         0          222          < 10

               6                         2 wp 1st   219          < 10
                                         2 wp 1st   220          < 10
                                         2 wp 1st   221          < 10
                                         2 wp 1st   222          15

4              14      gagmod DNA        2 wp 4th   219          < 10
                                         2 wp 4th   220          88
                                         2 wp 4th   221          < 10
                                         2 wp 4th   222          56

5              30      gagmod DNA        2 wp 5th   219          < 10
                                         2 wp 5th   220          391
                                         2 wp 5th   221          237
                                         2 wp 5th   222          222

6              46      gag VLP protein   2 wp 6th   219          753
                                         2 wp 6th   219          4330
                                         2 wp 6th   219          5000
                                         2 wp 6th   219          2881

a wpi = weeks post immunization

b geometric mean antibody titer

In Table 14, pre-bleed data are given as Immunization
No. 0; data for bleeds taken 2 weeks post-first immunization
are given as Immunization No. 1; data for bleeds taken 2
weeks post-second immunization are given as Immunization No.
2; and, data for bleeds taken 2 weeks post-third
immunization are given as Immunization No. 3.

Further, lymphoproliferative responses to p24 antigen
were also observed in baboons 221 and 222 two weeks
post-fourth immunization (at week 14), and enhanced substantially
post-boosting with VLP (at week 44 and 76). Such
proliferation results are indicative of induction of
T-helper cell functions.

B. Rhesus Macaques

The improved potency of the codon-modified gag
expression plasmid observed in mouse and baboon studies was
confirmed in rhesus macaques. Four of four macaques had
detectable Gag-specific CTL after two or three 1 mg doses of
modified gag plasmid. In contrast, in a previous study,
only one of four macaques given 1 mg doses of plasmid-DNA
encoding the wild-type HIV-1 SF2 Gag showed strong CTL
activity that was not apparent until after the seventh
immunization. Further evidence of the potency of the
modified gag plasmid was the observation that CTL from two
of the four rhesus macaques reacted with three
nonoverlapping Gag peptide pools, suggesting that as many as
three different Gag peptides are recognized and indicating
that the CTL response is polyclonal. Additional
quantification and specificity studies are in progress to
further characterize the T cell responses to Gag in the
plasmid-immunized rhesus macaques. DNA immunization of
macaques with the modified gag plasmid did not result in
significant antibody responses, with only two of four
animals seroconverting at low titers. In contrast, in the
same study the majority of macaques in groups immunized with
p55Gag protein seroconverted and had strong Gag-specific
antibody titers. These data suggest that a prime-boost
strategy (DNA-prime and protein-boost) could be very
promising for the induction of a strong CTL and antibody
response.

In sum, these results demonstrate that the synthetic
Gag plasmid DNA is immunogenic in non-human primates. When
similar experiments were carried out using wild-type Gag
plasmid DNA no such induction of anti-p24 antibodies was
observed after four immunizations.

Example 12 

DNA- and Protein Immunizations of Animals Using Env
Expression Cassettes and Polypeptides

A. Guinea Pigs

Groups comprising six guinea pigs each were immunized
intramuscularly at 0, 4, and 12 weeks with plasmid DNAs
encoding the gp120.modUS4, gp140.modUS4, gp140.modUS4.delV1,
gp140.modUS4.delV2, gp140.modUS4.delV1/V2, or gp160.modUS4
coding sequences of the US4-derived Env. The animals were
subsequently boosted at 18 weeks with a single intramuscular
dose of US4 o-gp140.mut.modUS4 protein in MF59 adjuvant.
Anti-gp120 SF2 antibody titers (geometric mean titers) were
measured at two weeks following the third DNA immunization
and at two weeks after the protein boost. Results are shown
in Table 15.
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Table 15 



Group                     GMT post-DNA    GMT post-protein
                          immuniz.        boost

gp120.modUS4              2098            9489
gp140.modUS4              190             5340
gp140.modUS4.delV1        341             7808
gp140.modUS4.delV2        386             8165
gp140.modUS4.delV1/V2     664             8270
gp160.modUS4              235             9928



These results demonstrate the usefulness of the
synthetic constructs to generate immune responses, as well
as the advantage of providing a protein boost to enhance
the immune response following DNA immunization.
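
The size of the boost effect can be read directly from Table 15. The
sketch below is an illustration only, not an analysis performed in
this Example: it divides the post-boost geometric mean titer (GMT)
by the post-DNA GMT for each group. The dictionary and function
names are hypothetical; the titers are transcribed from Table 15.

```python
# Geometric mean titers transcribed from Table 15:
# construct -> (GMT post-DNA immunization, GMT post-protein boost)
table_15 = {
    "gp120.modUS4": (2098, 9489),
    "gp140.modUS4": (190, 5340),
    "gp140.modUS4.delV1": (341, 7808),
    "gp140.modUS4.delV2": (386, 8165),
    "gp140.modUS4.delV1/V2": (664, 8270),
    "gp160.modUS4": (235, 9928),
}


def fold_increase(post_dna, post_boost):
    """Ratio of the post-boost GMT to the post-DNA GMT."""
    return post_boost / post_dna


folds = {name: fold_increase(*gmts) for name, gmts in table_15.items()}
for name, fold in folds.items():
    print(f"{name:24s} {fold:6.1f}-fold")
```

Because only group GMTs are tabulated, the ratios describe group
averages and say nothing about animal-to-animal variation.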

B. Rabbits

Rabbits were immunized intramuscularly and
intradermally using a Bioject needleless syringe with plasmid
DNAs encoding the following synthetic SF162 Env
polypeptides: gp120.modSF162, gp120.modSF162.delV2,
gp140.modSF162, gp140.modSF162.delV2, gp140.mut.modSF162,
gp140.mut.modSF162.delV2, gp160.modSF162, and
gp160.modSF162.delV2. Approximately 1 mg of plasmid DNA
(pCMVlink) carrying the synthetic Env expression cassette
was used to immunize the rabbits. Rabbits were immunized
with plasmid DNA at 0, 4, and 12 weeks. At two weeks after
the third immunization all of the constructs were shown to
have generated significant antibody titers in the test
animals. Further, rabbits immunized with constructs
containing deletions of the V2 region generally generated
similar antibody titers relative to rabbits immunized with
the companion construct still containing the V2 region.

The nucleic acid immunizations were followed by protein
boosting with o-gp140.modSF162.delV2 (0.1 mg of purified
protein) at 24 weeks after the initial immunization.
Results are shown in Table 16.

Table 16

Group                        GMT 2wks        GMT 2wks        GMT 2wks
                             post-2nd DNA    post-3rd DNA    post-protein
                             immunization    immunization    boost

gp120.modSF162               4573            5899            26033
gp120.modSF162.delV2         3811            3122            29606
gp140.modSF162               1478            710             12882
gp140.modSF162.delV2         1572            819             11067
gp140.mut.modSF162           1417            788             8827
gp140.mut.modSF162.delV2     1378            1207            13301
gp160.modSF162               23              81              7050
gp160.modSF162.delV2         85              459             11568



All constructs are highly immunogenic and generate 
substantial antigen binding antibody responses after only 2 
immunizations in rabbits. 

C. Baboons

Groups of four baboons were immunized intramuscularly
with 1 mg doses of DNA encoding different forms of synthetic
US4 gp140 (see the following table) at 0, 4, 8, 12, 28, and
44 weeks. The animals were also boosted twice with US4
o-gp140 protein (gp140.mut.modUS4) at 44 and 76 weeks using
MF59 as adjuvant. Results are shown in Table 17.
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5 



Table 17

Animal     Treatment                    2 Wks post      2 Wks post        2 Wks post
                                        5th DNA         6th DNA (plus     (gp140
                                        immunization    o-gp140 prot.     protein
                                                        immuniz.)         only)

CY 215     gp140.modUS4                 8.3             446               1813
CY 216     gp140.modUS4                 8.3             433               1236
CY 217     gp140.modUS4                 68              1660              2989
CY 218     gp140.modUS4                 101             2556              1610
Geomean:                                26.2            951.4             1812.1

CY 219     gp140.modUS4 + p55gag.SF2    8.3             8.3               421
CY 220     gp140.modUS4 + p55gag.SF2    8.3             8.3               3117
CY 221     gp140.modUS4 + p55gag.SF2    8.3             954               871
CY 222     gp140.modUS4 + p55gag.SF2    8.3             71                916
Geomean:                                8.3             46.5              1011.5

CY 223     gp140.mut.modUS4             41.4            10497             46432
CY 224     gp140.mut.modUS4             8.3             979               470
CY 225     gp140.mut.modUS4             135             2935              3870
CY 226     gp140.mut.modUS4             47              1209              4009
Geomean:                                68.3            2457.4            4289.6

CY 227     gp140TM.modUS4               8.3             56                5001
CY 228     gp140TM.modUS4               8.3             806               1170
CY 229     gp140TM.modUS4               8.3             48                3402
CY 230     gp140TM.modUS4               8.3             38                6520
GMT*:                                   8.3             95.3              3375.3

*GMT = geometric mean titer
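
The geometric mean rows of Table 17 can be recomputed from the
individual titers. The sketch below is an illustration, not code
from this Example; it assumes the usual definition of a geometric
mean (the exponential of the mean of the log titers) and checks it
against the first treatment group (CY 215 to CY 218). The function
and variable names are hypothetical.

```python
import math


def geometric_mean(titers):
    """Exponential of the arithmetic mean of the natural-log titers."""
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


# Individual titers for CY 215-218 (gp140.modUS4 group), from Table 17.
post_5th_dna = [8.3, 8.3, 68, 101]
post_6th_dna = [446, 433, 1660, 2556]
post_protein = [1813, 1236, 2989, 1610]

gmts = [geometric_mean(col)
        for col in (post_5th_dna, post_6th_dna, post_protein)]

# Table 17 reports 26.2, 951.4 and 1812.1 for this group.
for value in gmts:
    print(f"{value:.1f}")
```

A geometric rather than arithmetic mean is the conventional summary
for titers because dilution series are multiplicative, so a single
high responder does not dominate the group value.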



The results in Table 17 demonstrate the usefulness of
the synthetic constructs to generate immune responses in
primates such as baboons. In addition, all animals showed
evidence of antigen-specific (Env antigen)
lymphoproliferative responses.



D. Rhesus Macaques

Two rhesus macaques (designated H445 and J408) were
immunized with 1 mg of DNA encoding SF162 gp140 with a
deleted V2 region (SF162.gp140.delV2) by intramuscular (IM)
and intradermal (ID) routes at 0, 4, 8, and 28 weeks.
Approximately 100 µg of the protein encoded by the
SF162.gp140mut.delV2 construct was also administered in MF59 by IM
delivery at 28 weeks.

ELISA titers are shown in Figure 61. Neutralizing
antibody activity is shown in Tables 18 and 19. Neutralizing
antibody activity was determined against a variety of
primary HIV-1 isolates in a primary lymphocyte or
"PBMC-based" assay (see the following tables). Further, the
phenotypic co-receptor usage for each of the primary
isolates is indicated. As can be seen in the tables,
neutralizing antibodies were detected against every isolate
tested, including the HIV-1 primary isolates (i.e., SF128A,
92US660, 92HT593, 92US657, 92US714, 91US056, and 91US054).
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Table 18 




           Treatment                               Bleed 0      Bleed 1      Bleed 2
Animal     1st                2nd                  1st          2nd          2 Wks
           Immunization       Immunization         Imm'n        Imm'n        post 2nd

EO 456     25µg 120mod DNA    (None)               8.3          45           309
EO 457     25µg 120mod DNA    (None)               8.3          254          460
EO 458     25µg 120mod DNA    (None)               8.3          8.3          93
EO 459     25µg 120mod DNA    (None)               8.3          43           45
EO 460     25µg 120mod DNA    (None)               8.3          8.3          274

EO 461     25µg 120mod DNA    25µg 120mod DNA      8.3          47           1502
EO 462     25µg 120mod DNA    25µg 120mod DNA      8.3          80           5776
EO 463     25µg 120mod DNA    25µg 120mod DNA      8.3          89           3440
EO 464     25µg 120mod DNA    25µg 120mod DNA      8.3          8.3          3347
EO 465     25µg 120mod DNA    25µg 120mod DNA      8.3          69           1127

EO 466     50µg 120mod DNA    (None)               8.3          63           102
EO 467     50µg 120mod DNA    (None)               8.3          112          662
EO 468     50µg 120mod DNA    (None)               8.3          94           459
EO 469     50µg 120mod DNA    (None)               8.3          58           48
EO 470     50µg 120mod DNA    (None)               8.3          95           355

EO 471     50µg 120mod DNA    50µg 120mod DNA      8.3          110          9074
EO 472     50µg 120mod DNA    50µg 120mod DNA      8.3          8.3          4897
EO 473     50µg 120mod DNA    50µg 120mod DNA      8.3          49           4089
EO 474     50µg 120mod DNA    50µg 120mod DNA      8.3          59           5280
EO 475     50µg 120mod DNA    50µg 120mod DNA      8.3          8.3          929

EO 476     25µg 120mod DNA    Sindbis/Env          8.3                       653
EO 477     25µg 120mod DNA    Sindbis/Env          8.3          87           22675
EO 478     25µg 120mod DNA    Sindbis/Env          8.3          76           3869
EO 479     25µg 120mod DNA    Sindbis/Env          8.3                       1004
EO 480     25µg 120mod DNA    Sindbis/Env          8.3          71           7080

Table 19

           Treatment                               Bleed 0      Bleed 1      Bleed 2
Animal     1st                2nd                  1st          2nd          2 Wks
           Immunization       Immunization         Imm'n        Imm'n        post 2nd

EO 481     Sindbis/Env        (None)               8.3          8.3          8.3
EO 482     Sindbis/Env        (None)               8.3          8.3          8.3
EO 483     Sindbis/Env        (None)               8.3          78           103
EO 484     Sindbis/Env        (None)               8.3          8.3          32
EO 485     Sindbis/Env        (None)               8.3          76           207

EO 486     Sindbis/Env        Sindbis/Env          8.3          8.3          458
EO 487     Sindbis/Env        Sindbis/Env          8.3          8.3          345
EO 488     Sindbis/Env        Sindbis/Env          8.3          8.3          331
EO 489     Sindbis/Env        Sindbis/Env          8.3          103          111
EO 490     Sindbis/Env        Sindbis/Env          8.3          8.3          5636

Lymphoproliferative activity (LPA) was also determined
by antigenic stimulation followed by uptake of ³H-thymidine
in these animals and is shown in Table 20. Experiment 1 was
performed at 14 weeks post third DNA immunization and
Experiment 2 was performed at 2 weeks post fourth DNA
immunization using DNA and protein. For gp120ThaiE,
gp120SF2 and US4 o-gp140, appropriate background values were
used to calculate Stimulation Indices (S.I.; Antigenic
stimulation CPM/Background CPM).



Table 20

S.I.: Calculated as Ag CPM/Background CPM

Animal/exp#    gp120ThaiE    gp120 SF2    env2-3SF2    o-gp140US4

J408/#1        2             1            1            5
H445/#1        1             1            1            6
J408/#2        1             1            2            3
H445/#2        0             0            3            2



As can be seen by the results presented in Table 20,
lymphoproliferative responses to o-gp140.US4 antigen were
also observed in all four animals at both experimental time
points. Such proliferation results are indicative of induction
of T-helper cell functions.

The results presented above demonstrate that the
synthetic gp140.modSF162.delV2 DNA and protein are
immunogenic in non-human primates.
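
The Stimulation Index reported in Table 20 is defined in the text as
antigen-stimulated CPM divided by background CPM. A minimal sketch
of that ratio follows; the function name and the counts are
hypothetical and are not data from this Example.

```python
def stimulation_index(antigen_cpm, background_cpm):
    """S.I. = antigen-stimulated CPM / background CPM."""
    if background_cpm <= 0:
        raise ValueError("background CPM must be positive")
    return antigen_cpm / background_cpm


# Hypothetical 3H-thymidine counts, for illustration only.
si = stimulation_index(antigen_cpm=6000, background_cpm=1000)
print(f"S.I. = {si:.0f}")
```

An S.I. near 1 means the antigen gave no proliferation above
background, which is how the low values in Table 20 for the
heterologous antigens can be read.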
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Example 13 

In vitro expression of recombinant Sindbis RNA and DNA 
containing the synthetic Gag or Env expression cassettes 

A. Synthetic Gag expression cassettes

To evaluate the expression efficiency of the synthetic
Gag expression cassette in Alphavirus vectors, the synthetic
Gag expression cassette was subcloned into both plasmid
DNA-based and recombinant vector particle-based Sindbis virus
vectors. Specifically, a cDNA vector construct for in vitro
transcription of Sindbis virus RNA vector replicons
(pRSIN-luc; Dubensky, et al., J Virol. 70:508-519, 1996) was
modified to contain a PmeI site for plasmid linearization
and a polylinker for insertion of heterologous genes. A
polylinker was generated using two oligonucleotides that
contain the sites XhoI, PmlI, ApaI, NarI, XbaI, and NotI
(XPANXNF, SEQ ID NO:17, and XPANXNR, SEQ ID NO:18).

The plasmid pRSIN-luc (Dubensky et al., supra) was
digested with XhoI and NotI to remove the luciferase gene
insert, blunt-ended using Klenow and dNTPs, and purified
from an agarose gel using GeneCleanII (Bio101, Vista, CA).
The oligonucleotides were annealed to each other and ligated
into the plasmid. The resulting construct was digested with
NotI and SacI to remove the minimal Sindbis 3'-end sequence
and A40 tract, and ligated with an approximately 0.4 kbp
fragment from pKSSIN1-BV (WO 97/38087). This 0.4 kbp
fragment was obtained by digestion of pKSSIN1-BV with NotI
and SacI, and purification after size fractionation from an
agarose gel. The fragment contained the complete Sindbis
virus 3'-end, an A40 tract and a PmeI site for
linearization. This new vector construct was designated SINBVE.

The synthetic HIV Gag coding sequence was obtained from
the parental plasmid by digestion with EcoRI, blunt-ending
with Klenow and dNTPs, purification with GeneCleanII,
digestion with SalI, size fractionation on an agarose gel,
and purification from the agarose gel using GeneCleanII.
The synthetic Gag coding fragment was ligated into the
SINBVE vector that had been digested with XhoI and PmlI.
The resulting vector was purified using GeneCleanII and
designated SINBVGag. Vector RNA replicons may be
transcribed in vitro (Dubensky et al., supra) from SINBVGag
and used directly for transfection of cells. Alternatively,
the replicons may be packaged into recombinant vector
particles by co-transfection with defective helper RNAs or
using an alphavirus packaging cell line as described, for
example, in U.S. Patent Numbers 5,843,723 and 5,789,245, and
then administered in vivo as described.

The DNA-based Sindbis virus vector pDCMVSIN-beta-gal
(Dubensky, et al., J Virol. 70:508-519, 1996) was digested
with SalI and XbaI, to remove the beta-galactosidase gene
insert, and purified using GeneCleanII after agarose gel
size fractionation. The HIV Gag gene was inserted into the
pDCMVSIN-beta-gal by digestion of SINBVGag with SalI and
XhoI, purification using GeneCleanII of the Gag-containing
fragment after agarose gel size fractionation, and ligation.
The resulting construct was designated pDSIN-Gag, and may be
used directly for in vivo administration or formulated using
any of the methods described herein.

BHK and 293 cells were transfected with recombinant
Sindbis vector RNA and DNA, respectively. The supernatants
and cell lysates were tested with the Coulter p24 capture
ELISA (Example 2).

BHK cells were transfected by electroporation with
recombinant Sindbis RNA. The expression of p24 (in ng/ml)
is presented in Table 21. In the table, SINGag#1 and 2
represent duplicate measurements, and SINβgal represents a
negative control. Supernatants and lysates were collected
24h post transfection.

Table 21

Construct        Supernatant    Lysate

SINβgal RNA      0              0
SINGag#1 RNA     7 ng           Max (approx. 1 µg)
SINGag#2 RNA     1 ng           700 ng



293 cells were transfected using LT-1 (Example 2) with
recombinant Sindbis DNA. Synthetic pCMVKM2.GagMod.SF2 was
used as a positive control. Supernatants and lysates were
collected 48h post transfection. The expression of p24 (in
ng/ml) is presented in Table 22.



Table 22

Construct                  Supernatant    Lysate

SINGag DNA                 3              30
pCMVKM2.GagMod.SF2 DNA     32             42



The results presented in Tables 21 and 22 demonstrate
that Gag proteins can be efficiently expressed from both DNA
and RNA-based Sindbis vector systems using the synthetic Gag
expression cassette (p55Gag.mod).
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B. Synthetic Env expression cassettes 

To evaluate the expression efficiency of the synthetic
Env expression cassette in Alphavirus vectors, synthetic Env
expression cassettes were subcloned into both plasmid
DNA-based and recombinant vector particle-based Sindbis virus
vectors as described above for Gag.

The synthetic HIV Env coding sequence was obtained from
the parental plasmid by digestion with SalI and XbaI, size
fractionation on an agarose gel, and purification from the
agarose gel using GeneCleanII. The synthetic Env coding
fragment was ligated into the SINBVE vector that had been
digested with XhoI and XbaI. The resulting vector was
purified using GeneCleanII and designated SINBVEnv. Vector
RNA replicons may be transcribed in vitro (Dubensky et al.,
supra) from SINBVEnv and used directly for transfection of
cells. Alternatively, the replicons may be packaged into
recombinant vector particles by co-transfection with
defective helper RNAs or using an alphavirus packaging cell
line and administered as described above for Gag.

The DNA-based Sindbis virus vector pDCMVSIN-beta-gal
(Dubensky, et al., J Virol. 70:508-519, 1996) was digested
with SalI and XbaI, to remove the beta-galactosidase gene
insert, and purified using GeneCleanII after agarose gel
size fractionation. The HIV Env gene was inserted into the
pDCMVSIN-beta-gal by digestion of SINBVEnv with XbaI and
XhoI, purification using GeneCleanII of the Env-containing
fragment after agarose gel size fractionation, and ligation.
The resulting construct was designated pDSIN-Env, and may be
used directly for in vivo administration or formulated using
any of the methods described herein.

BHK and 293 cells were transfected with recombinant
Sindbis vector RNA and DNA, respectively. The supernatants
and cell lysates were tested by capture ELISA.

BHK cells were transfected by electroporation with
recombinant Sindbis RNA. The expression of Env (in ng/ml)
is presented in Table 23. In the table, the Sindbis RNA
containing synthetic Env expression cassettes are indicated
and βgal represents a negative control. Supernatants and
lysates were collected 24h post transfection.

Table 23

Construct                 Supernatant     Lysate
                          (Neat) ng/ml    (1:10 dilution) ng/ml

βgal RNA                  0               0
gp140.modUS4              726             7147
gp140.modSF162            3529            7772
gp140.modUS4.delV1/V2     1738            6526
gp140.modUS4.delV2        960             3023
gp140.modSF162.delV2      2772            3359



293 cells were transfected using LT-1 mediated
transfection (PanVera) with recombinant Sindbis DNA
containing synthetic expression cassettes of the present
invention and βgal sequences as a negative control.
Supernatants and lysates were collected 48h post
transfection. The expression of Env (in ng/ml) is presented
in Table 24.
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Table 24 



Construct                 Supernatant     Lysate
                          (Neat) ng/ml    (1:10 dilution) ng/ml

βgal                      0               0
gp140.modSF162.delV2      1977            801
gp140.modSF162            949             746



The results presented in Tables 23 and 24 demonstrate
that Env proteins can be efficiently expressed from both DNA
and RNA-based Sindbis vector systems using the synthetic Env
expression cassettes of the present invention.

Example 14 

A. In vivo Immunization with Gag-containing DNA and/or
Sindbis particles

CB6F1 mice were immunized intramuscularly at 0 and 4
weeks with plasmid DNA and/or Sindbis vector RNA-containing
particles each containing GagMod.SF2 sequences as indicated
in Table 25. Animals were challenged with recombinant
vaccinia expressing SF2 Gag at 3 weeks post second
immunization (at week 7). Spleens were removed from the
immunized and challenged animals 5 days later for a standard
51Cr release assay for CTL activity. Values shown in Table
25 indicate the results from the spleens of three mice from
each group. The boxed values in Table 25 indicate that all
groups of mice receiving immunizations with
pCMVKm2.GagMod.SF2 DNA and/or SindbisGagMod.SF2 virus
particles either alone or in combinations showed
antigen-specific CTL activity.
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Table 25 



Cytotoxic T-lymphocyte (CTL) responses in mice immunized with HIV-1
gagmod DNA and Sindbis gagmod virus particles

                                             Percent specific lysis of
                                             target cells*
Immunization                       E:T       SVBALB    SVBALB    RMA
                                             none      p7g       p7g

pCMVKm2.GagMod.SF2 DNA a           100:1     5         20        1
at 0, 4 wks                        25:1      5         20        <1
                                   6:1       4         8         <1

SindbisGagMod.SF2                  100:1     10        49        <1
virus particles b                  25:1      7         20        <1
at 0, 4 weeks                      6:1       5         12        <1

pCMVKm2.GagMod.SF2 DNA at 0        100:1     9         58        <1
wks SindbisGagMod.SF2 virus        25:1      7         42        2
particles at 4 wks                 6:1       4         13        <1

SindbisGagMod.SF2                  100:1     5         38        <1
virus particles at 4 wks           25:1      4         18        <1
pCMVKm2.GagMod.SF2 DNA at 0 wks    6:1       3         13        1

a 20 µg
b 10^7 particles

* Challenge with recombinant vaccinia virus expressing HIV-1SF2
Gag at 3 weeks post second immunization (week 7). Spleens taken
5 days later. Ex vivo CTL assay performed by standard 51Cr
release assay. Values seen represent results from 3 pooled mouse
spleens per group.
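
The "percent specific lysis" values in Table 25 come from a 51Cr
release assay. The text does not state the formula that was used;
the sketch below assumes the conventional one, in which the release
from target cells incubated with effectors is corrected for
spontaneous release and scaled by the maximum (detergent) release.
The function name and the counts are hypothetical.

```python
def percent_specific_lysis(experimental_cpm, spontaneous_cpm, maximum_cpm):
    """Conventional 51Cr-release formula (assumed, not stated in the text):

    100 * (experimental - spontaneous) / (maximum - spontaneous)
    """
    window = maximum_cpm - spontaneous_cpm
    if window <= 0:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental_cpm - spontaneous_cpm) / window


# Hypothetical counts for one effector:target ratio, for illustration only.
lysis = percent_specific_lysis(
    experimental_cpm=1500, spontaneous_cpm=500, maximum_cpm=2500
)
print(f"{lysis:.0f}% specific lysis")
```

Under this convention, antigen specificity is judged by comparing
peptide-pulsed targets with unpulsed or mismatched targets at the
same E:T ratio, as in the three target columns of Table 25.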

B. In vivo Immunization with Env-containing DNA and/or
Sindbis particles

Balb/C mice were immunized intramuscularly at 0 and 4
weeks (as shown in the following table) with plasmid DNA
and/or Sindbis-virus RNA-containing particles each
containing gp120.modUS4 sequences. Treatment regimes and
antibody titers are shown in Table 26. Antibody titers were
determined by ELISA using gp120 SF2 protein to coat the
plates.
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10 



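Table 26

           Treatment                               Bleed 0      Bleed 1      Bleed 2
                                                                (8 wks)      (10 wks)
Animal     1st                2nd                  1st          2nd          2 Wks
           Immunization       Immunization         Imm'n        Imm'n        post 2nd

EO 456     25µg 120mod DNA    (None)               8.3          45           309
EO 457     25µg 120mod DNA    (None)               8.3          254          460
EO 458     25µg 120mod DNA    (None)               8.3          8.3          93
EO 459     25µg 120mod DNA    (None)               8.3          43           45
EO 460     25µg 120mod DNA    (None)               8.3          8.3          274

EO 461     25µg 120mod DNA    25µg 120mod DNA      8.3          47           1502
EO 462     25µg 120mod DNA    25µg 120mod DNA      8.3          80           5776
EO 463     25µg 120mod DNA    25µg 120mod DNA      8.3          89           3440
EO 464     25µg 120mod DNA    25µg 120mod DNA      8.3          8.3          3347
EO 465     25µg 120mod DNA    25µg 120mod DNA      8.3          69           1127

EO 466     50µg 120mod DNA    (None)               8.3          63           102
EO 467     50µg 120mod DNA    (None)               8.3          112          662
EO 468     50µg 120mod DNA    (None)               8.3          94           459
EO 469     50µg 120mod DNA    (None)               8.3          58           48
EO 470     50µg 120mod DNA    (None)               8.3          95           355

EO 471     50µg 120mod DNA    50µg 120mod DNA      8.3          110          9074
EO 472     50µg 120mod DNA    50µg 120mod DNA      8.3          8.3          4897
EO 473     50µg 120mod DNA    50µg 120mod DNA      8.3          49           4089
EO 474     50µg 120mod DNA    50µg 120mod DNA      8.3          59           5280
EO 475     50µg 120mod DNA    50µg 120mod DNA      8.3          8.3          929

EO 476     25µg 120mod DNA    Sindbis/Env          8.3                       653
EO 477     25µg 120mod DNA    Sindbis/Env          8.3          87           22675
EO 478     25µg 120mod DNA    Sindbis/Env          8.3          76           3869
EO 479     25µg 120mod DNA    Sindbis/Env          8.3                       1004
EO 480     25µg 120mod DNA    Sindbis/Env          8.3          71           7080

EO 481     Sindbis/Env        (None)               8.3          8.3          8.3
EO 482     Sindbis/Env        (None)               8.3          8.3          8.3
EO 483     Sindbis/Env        (None)               8.3          78           103
EO 484     Sindbis/Env        (None)               8.3          8.3          32
EO 485     Sindbis/Env        (None)               8.3          76           207

EO 486     Sindbis/Env        Sindbis/Env          8.3          8.3          458
EO 487     Sindbis/Env        Sindbis/Env          8.3          8.3          345
EO 488     Sindbis/Env        Sindbis/Env          8.3          8.3          331
EO 489     Sindbis/Env        Sindbis/Env          8.3          103          111
EO 490     Sindbis/Env        Sindbis/Env          8.3          8.3          5636
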



As can be seen from the data presented above, all of
the mice generally demonstrated substantial immunological
responses by bleed number 2. For Env, the best results were
obtained using either (i) 50 µg of gp120.modUS4 DNA for the
first immunization followed by a second immunization using
50 µg of gp120.modUS4 DNA, or (ii) 25 µg of gp120.modUS4 DNA
for the first immunization followed by a second immunization
using 10^7 pfu of Sindbis.

The results presented above demonstrate that the Env
and Gag proteins of the present invention are effective to
induce an immune response using Sindbis vector systems which
include the synthetic Env (e.g., gp120.modUS4) or Gag
expression cassettes.



Example 15 

Co-Transfection of Env and Gag as Monocistronic and
Bicistronic Constructs

DNA constructs encoding (i) wild-type US4 and SF162 Env
polypeptides, (ii) synthetic US4 and SF162 Env polypeptides
(gp160.modUS4, gp160.modUS4.delV1/V2, gp160.modSF162, and
gp120.modSF162.delV2), and (iii) SF2gag polypeptide (i.e.,
the Gag coding sequences obtained from the SF2 variant or
optimized sequences corresponding to the gagSF2,
gag.modSF2) were prepared. These monocistronic constructs
were co-transfected into 293T cells in a transient
transfection protocol using the following combinations:
gp160.modUS4; gp160.modUS4 and gag.modSF2;
gp160.modUS4.delV1/V2; gp160.modUS4.delV1/V2 and gag.modSF2;
gp160.modSF162 and gag.modSF2; gp120.modSF162.delV2 and
gag.modSF2; and gag.modSF2 alone.

Further, several bicistronic constructs were made where
the coding sequences for Env and Gag were under the control
of a single CMV promoter and, between the two coding
sequences, an IRES (internal ribosome entry site; EMCV
IRES; Kozak, M., Critical Reviews in Biochemistry and
Molecular Biology 27(45):385-402, 1992; Witherell, G.W., et
al., Virology 214:660-663, 1995) sequence was introduced
after the Env coding sequence and before the Gag coding
sequence. Those constructs were as follows:
gp160.modUS4.gag.modSF2, SEQ ID NO:73 (Figure 61);
gp160.modSF162.gag.modSF2, SEQ ID NO:74 (Figure 62);
gp160.modUS4.delV1/V2.gag.modSF2, SEQ ID NO:75 (Figure 63);
and gp160.modSF162.delV2.gag.modSF2, SEQ ID NO:76 (Figure
64).

Supernatants from cell culture were filtered through
0.45 µm filters then ultracentrifuged for 2 hours at 24,000
rpm (140,000 x g) in an SW28 rotor through a 20% sucrose
cushion. The pelleted materials were suspended and layered
on a 20-60% sucrose gradient and spun for 2 hours at 40,000
rpm (285,000 x g) in an SW41Ti rotor. Gradients were
fractionated into 1.0 ml samples. A total of 9-10 fractions
were typically collected from each DNA transfection group.

The fractions were tested for the presence of the Env 
and Gag proteins (across all fractions) . These results 
demonstrated that the appropriate proteins were expressed in 
the transfected cells (i.e., if an Env coding sequence was 
present the corresponding Env protein was detected; if a Gag 
coding sequence was present the corresponding Gag protein 
was detected) . 

Virus like particles (VLPs) were known to be present
through a selected range of sucrose densities. Chimeric
virus like particles (VLPs) were formed using all the tested
combinations of constructs containing both Env and Gag.
Significantly more protein was found in the supernatant
collected from the cells transfected with
"gp160.modUS4.delV1/V2 and gag.modSF2" than in all the other
supernatants.

Western blot analysis was also performed on sucrose
gradient fractions from each transfection. The results show
that bicistronic plasmids gave lower amounts of VLPs than
the amounts obtained using co-transfection with
monocistronic plasmids.

In order to verify the production of chimeric VLPs by
these cell lines the following electron microscopic analysis
was carried out.

293T cells were plated at a density of 60-70%
confluence in 100 mm dishes on the day before transfection.
The cells were transfected with 10 µg of DNA in transfection
reagent LT1 (Panvera Corporation, 545 Science Dr., Madison,
WI). The cells were incubated overnight in reduced serum
medium (Opti-MEM, Gibco-BRL, Gaithersburg, MD). The medium
was replaced with 10% fetal calf serum, 2% glutamine in IMDM
in the morning of the next day and the cells were incubated
for 65 hours. Supernatants and lysates were collected for
analysis as described above (see Example 2).

The fixed, transfected 293T cells and purified ENV-GAG
VLPs were analyzed by electron microscopy. The cells were
fixed as follows. Cell monolayers were washed twice with
PBS and fixed with 2% glutaraldehyde. For purified VLPs,
gradient peak fractions were collected and concentrated by
ultracentrifugation (24,000 rpm) for 2 hours. Electron
microscopic analysis was performed by Prof. T.S. Benedict
Yen (Veterans Affairs, Medical Center, San Francisco, CA).
Electron microscopy was carried out using a
transmission electron microscope (Zeiss 10c). The cells
were pre-stained with osmium and stained with uranium
acetate and lead citrate. Immunostaining was performed to
visualize envelope on the VLP. The magnification was
100,000X.

Figures 65A-65F show micrographs of 293T cells
transfected with the following constructs: Figure 65A,
gag.modSF2; Figure 65B, gp160.modUS4; Figure 65C,
gp160.modUS4.delV1/V2.gag.modSF2 (bicistronic Env and Gag);
Figures 65D and 65E, gp160.modUS4.delV1/V2 and gag.modSF2;
and Figure 65F, gp120.modSF162.delV2 and gag.modSF2. In the
figures, free and budding immature virus-like-particles
(VLPs) of the expected size (approximately 100 nm) decorated
with the Env protein were seen. In sum, gp160 polypeptides
incorporate into Gag VLPs when constructs are
co-transfected into cells. The efficiency of incorporation is
2-3 fold higher when constructs encoding V-deleted Env
polypeptides from high synthetic expression cassettes are
used.

Although preferred embodiments of the subject invention 
have been described in some detail, it is understood that 
obvious variations can be made without departing from the 
spirit and the scope of the invention as defined by the 
appended claims. 

What Is Claimed Is:

1. An expression cassette, comprising
a polynucleotide sequence encoding a polypeptide
including an HIV Gag polypeptide, wherein the polynucleotide
sequence encoding said Gag polypeptide comprises a sequence
having at least 90% sequence identity to the sequence
presented as SEQ ID NO:20.

2. The expression cassette of claim 1, comprising,
a polynucleotide sequence encoding a polypeptide
including an HIV Gag polypeptide, wherein the polynucleotide
sequence encoding said Gag polypeptide comprises a sequence
having at least 90% sequence identity to the sequence
presented as SEQ ID NO:9.

3. The expression cassette of claim 1, wherein said
polynucleotide sequence encoding a polypeptide including an
HIV Gag polypeptide comprises a sequence having at least 90%
sequence identity to the sequence presented as SEQ ID NO:4.

4. The expression cassette of claim 1, wherein said
polynucleotide sequence further includes a polynucleotide
sequence encoding an HIV protease polypeptide.

5. The expression cassette of claim 4, wherein the
nucleotide sequence encoding said polypeptide comprises a
sequence having at least 90% sequence identity to a sequence
selected from the group consisting of: SEQ ID NO:5, SEQ ID
NO:78, and SEQ ID NO:79.

6. The expression cassette of claim 1, wherein said
polynucleotide sequence further includes a polynucleotide
sequence encoding an HIV reverse transcriptase polypeptide.

7. The expression cassette of claim 6, wherein the
nucleotide sequence encoding said polypeptide comprises a
sequence having at least 90% sequence identity to a sequence
selected from the group consisting of: SEQ ID NO:80, SEQ ID
NO:81, SEQ ID NO:82, SEQ ID NO:83, and SEQ ID NO:84.

8. The expression cassette of claim 1, wherein said
polynucleotide sequence further includes a polynucleotide
sequence encoding an HIV tat polypeptide.

9. The expression cassette of claim 8, wherein the
nucleotide sequence encoding said polypeptide comprises a
sequence having at least 90% sequence identity to a sequence
selected from the group consisting of: SEQ ID NO:87, SEQ ID
NO:88, and SEQ ID NO:89.

10. The expression cassette of claim 1, wherein said
polynucleotide sequence further includes a polynucleotide
sequence encoding an HIV polymerase polypeptide, wherein the
nucleotide sequence encoding said polypeptide comprises a
sequence having at least 90% sequence identity to the
sequence presented as SEQ ID NO:6.

11. The expression cassette of claim 1, wherein said
polynucleotide sequence further includes a polynucleotide
sequence encoding an HIV polymerase polypeptide, wherein (i)
the nucleotide sequence encoding said polypeptide comprises
a sequence having at least 90% sequence identity to the
sequence presented as SEQ ID NO:4, and (ii) wherein the
sequence is modified by deletions of coding regions
corresponding to reverse transcriptase and integrase.

12. The expression cassette of claim 11, wherein said
polynucleotide sequence preserves T-helper cell and CTL
epitopes.

13. The expression cassette of claim 1, wherein said
polynucleotide sequence further includes a polynucleotide
sequence encoding an HCV core polypeptide, wherein the
nucleotide sequence encoding said polypeptide comprises a
sequence having at least 90% sequence identity to the
sequence presented as SEQ ID NO:7.

14. An expression cassette, comprising a
polynucleotide sequence encoding a polypeptide including an
HIV Env polypeptide, wherein the polynucleotide sequence
encoding said Env polypeptide comprises a sequence having at
least 90% sequence identity to SEQ ID NO:71 (Figure 58) or
SEQ ID NO:72 (Figure 59).

15. The expression cassette of claim 14, wherein said
Env polypeptide includes sequences flanking a V1 region but
has a deletion in the V1 region itself.

16. The expression cassette of claim 15, wherein the
polynucleotide sequence encoding the polypeptide comprises
the sequence presented as SEQ ID NO:65 (Figure 52,
gp160.modUS4.delV1).

17. The expression cassette of claim 14, wherein said
Env polypeptide includes sequences flanking a V2 region but
has a deletion in the V2 region itself.

18. The expression cassette of claim 17, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:60 (Figure 47); and
SEQ ID NO:66 (Figure 53).

19. The expression cassette of claim 17, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:34 (Figure 20); SEQ
ID NO:37 (Figure 24); SEQ ID NO:40 (Figure 27); SEQ ID NO:43
(Figure 30); SEQ ID NO:46 (Figure 33); SEQ ID NO:76 (Figure
64) and SEQ ID NO:49 (Figure 36).

20. The expression cassette of claim 14, wherein said
Env polypeptide includes sequences flanking a V1/V2 region
but has a deletion in the V1/V2 region itself.

21. The expression cassette of claim 20, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:59 (Figure 46); SEQ
ID NO:61 (Figure 48); SEQ ID NO:67 (Figure 54); and SEQ ID
NO:75 (Figure 63).

22. The expression cassette of claim 20, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:35 (Figure 21); SEQ
ID NO:38 (Figure 25); SEQ ID NO:41 (Figure 28); SEQ ID NO:44
(Figure 31); SEQ ID NO:47 (Figure 34) and SEQ ID NO:50
(Figure 37).

23. The expression cassette of claim 14, wherein said
Env polypeptide has a mutated cleavage site that prevents
the cleavage of a gp140 polypeptide into a gp120 polypeptide
and a gp41 polypeptide.

24. The expression cassette of claim 23, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:57 (Figure 44); SEQ
ID NO:61 (Figure 48); and SEQ ID NO:63 (Figure 50).

25. The expression cassette of claim 23, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:39 (Figure 26); SEQ
ID NO:40 (Figure 27); SEQ ID NO:41 (Figure 28); SEQ ID NO:42
(Figure 29); SEQ ID NO:43 (Figure 30); SEQ ID NO:44 (Figure
31); SEQ ID NO:45 (Figure 32); SEQ ID NO:46 (Figure 33); and
SEQ ID NO:47 (Figure 34).

26. The expression cassette of claim 14, wherein said
Env polypeptide includes a gp160 Env polypeptide or a
polypeptide derived from a gp160 Env polypeptide.

27. The expression cassette of claim 26, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:64 (Figure 51); SEQ
ID NO:65 (Figure 52); SEQ ID NO:66 (Figure 53); SEQ ID NO:67
(Figure 54); SEQ ID NO:68 (Figure 55); SEQ ID NO:75 (Figure
63); and SEQ ID NO:73 (Figure 61).

28. The expression cassette of claim 26, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:48 (Figure 35); SEQ
ID NO:49 (Figure 36); SEQ ID NO:50 (Figure 37); SEQ ID NO:76
(Figure 64); and SEQ ID NO:74 (Figure 62).

29. The expression cassette of claim 14, wherein said
Env polypeptide includes a gp140 Env polypeptide or a
polypeptide derived from a gp140 Env polypeptide.

30. The expression cassette of claim 29, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:56 (Figure 43); SEQ
ID NO:57 (Figure 44); SEQ ID NO:58 (Figure 45); SEQ ID NO:59
(Figure 46); SEQ ID NO:60 (Figure 47); SEQ ID NO:61 (Figure
48); SEQ ID NO:62 (Figure 49); and SEQ ID NO:63 (Figure 50).

31. The expression cassette of claim 29, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:36 (Figure 23); SEQ
ID NO:37 (Figure 24); SEQ ID NO:38 (Figure 25); SEQ ID NO:39
(Figure 26); SEQ ID NO:40 (Figure 27); SEQ ID NO:41 (Figure
28); SEQ ID NO:42 (Figure 29); SEQ ID NO:43 (Figure 30); SEQ
ID NO:44 (Figure 31); SEQ ID NO:45 (Figure 32); SEQ ID NO:46
(Figure 33); and SEQ ID NO:47 (Figure 34).

32. The expression cassette of claim 14, wherein said
Env polypeptide includes a gp120 Env polypeptide or a
polypeptide derived from a gp120 Env polypeptide.

33. The expression cassette of claim 32, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:54 (Figure 41); and
SEQ ID NO:55 (Figure 42).

34. The expression cassette of claim 32, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:33 (Figure 19); SEQ
ID NO:34 (Figure 20); and SEQ ID NO:35 (Figure 21).

35. The expression cassette of claim 14, wherein the
polynucleotide sequence encoding the polypeptide is selected
from the group consisting of: SEQ ID NO:55 (Figure 42); SEQ
ID NO:62 (Figure 49); SEQ ID NO:63 (Figure 50); and SEQ ID
NO:68 (Figure 55).

36. A recombinant expression system for use in a
selected host cell, comprising, an expression cassette of
claim 1, and wherein said polynucleotide sequence is
operably linked to control elements compatible with
expression in the selected host cell.

37. The recombinant expression system of claim 36,
wherein said control elements are selected from the group
consisting of a transcription promoter, a transcription
enhancer element, a transcription termination signal,
polyadenylation sequences, sequences for optimization of
initiation of translation, and translation termination
sequences.

38. The recombinant expression system of claim 36,
wherein said transcription promoter is selected from the
group consisting of CMV, CMV+intron A, SV40, RSV, HIV-Ltr,
MMLV-ltr, and metallothionein.

39. A cell comprising an expression cassette of claim
1, and wherein said polynucleotide sequence is operably
linked to control elements compatible with expression in the
selected cell.

40. The cell of claim 39, wherein the cell is a
mammalian cell.

41. The cell of claim 40, wherein the cell is selected
from the group consisting of BHK, VERO, HT1080, 293, RD,
COS-7, and CHO cells.

42. The cell of claim 41, wherein said cell is a CHO
cell.

43. The cell of claim 39, wherein the cell is an
insect cell.

44. The cell of claim 43, wherein the cell is either
Trichoplusia ni (Tn5) or Sf9 insect cells.

45. The cell of claim 39, wherein the cell is a
bacterial cell.

46. The cell of claim 39, wherein the cell is a yeast
cell.





47. The cell of claim 39, wherein the cell is a plant
cell.

48. The cell of claim 39, wherein the cell is an
antigen presenting cell.

49. The cell of claim 48, wherein the lymphoid cell is
selected from the group consisting of macrophage, monocytes,
dendritic cells, B-cells, T-cells, stem cells, and
progenitor cells thereof.

50. The cell of claim 39, wherein the cell is a
primary cell.

51. The cell of claim 39, wherein the cell is an
immortalized cell.

52. The cell of claim 39, wherein the cell is a
tumor-derived cell.

53. A method for producing a polypeptide including HIV
Gag polypeptide sequences, said method comprising,
incubating the cells of claim 39, under conditions for
producing said polypeptide.

54. A method for producing virus-like particles
(VLPs), comprising,
incubating the cells of claim 39, under conditions for
producing said VLPs.

55. A method for producing a composition of virus-like
particles (VLPs), comprising,
(a) incubating the cells of claim 39, under conditions
for producing said VLPs; and
(b) substantially purifying said VLPs to produce a
composition of VLPs.

56. A cell line useful for packaging lentivirus
vectors, comprising
suitable host cells that have been transfected with an
expression vector containing an expression cassette of claim
1, and wherein said polynucleotide sequence is operably
linked to control elements compatible with expression in the
host cell.

57. A cell line useful for packaging lentivirus
vectors, comprising
suitable host cells that have been transfected with an
expression vector containing an expression cassette of claim
2, and wherein said polynucleotide sequence is operably
linked to control elements compatible with expression in the
host cell.

58. A cell line useful for packaging lentivirus
vectors, comprising
suitable host cells that have been transfected with an
expression vector containing an expression cassette of claim
3, and wherein said polynucleotide sequence is operably
linked to control elements compatible with expression in the
host cell.

59. A cell line useful for packaging lentivirus
vectors, comprising
suitable host cells that have been transfected with an
expression vector containing an expression cassette of claim
11, and wherein said polynucleotide sequence is operably
linked to control elements compatible with expression in the
host cell.

60. A gene delivery vector for use in a mammalian
subject, comprising
a suitable gene delivery vector for use in said
subject, wherein the vector comprises an expression cassette
of claim 1, and wherein said polynucleotide sequence is
operably linked to control elements compatible with
expression in the subject.

61. A method of DNA immunization of a subject,
comprising,
introducing a gene delivery vector of claim 60 into
said subject under conditions that are compatible with
expression of said expression cassette in said subject.

62. The method of claim 61, wherein said gene delivery
vector is a nonviral vector.

63. The method of claim 61, wherein said vector is
delivered using a particulate carrier.

64. The method of claim 63, wherein said vector is
coated on a gold or tungsten particle and said coated
particle is delivered to said subject using a gene gun.

65. The method of claim 63, wherein said vector is
encapsulated in a liposome preparation.

66. The method of claim 61, wherein said vector is a
viral vector.

67. The method of claim 66, wherein said viral vector
is a retroviral vector.

68. The method of claim 67, wherein said viral vector
is a lentiviral vector.

69. The method of claim 61, wherein said subject is a
mammal.

70. The method of claim 69, wherein said mammal is a
human.

71. A method of generating an immune response in a
subject, comprising
transfecting cells of said subject with a gene delivery
vector of claim 60, under conditions that permit the
expression of said polynucleotide and production of said
polypeptide, thereby eliciting an immunological response to
said polypeptide.

72. The method of claim 71, wherein said vector is a
nonviral vector.

73. The method of claim 72, wherein said vector is
delivered using a particulate carrier.

74. The method of claim 73, wherein said vector is
coated on a gold or tungsten particle and said coated
particle is delivered to said vertebrate cell using a gene
gun.

75. The method of claim 73, wherein said vector is
encapsulated in a liposome preparation.

76. The method of claim 71, wherein said vector is a 
viral vector. 

77. The method of claim 76, wherein said viral vector 
is a retroviral vector. 

78. The method of claim 77, wherein said viral vector 
is a lentiviral vector. 

79. The method of claim 71, wherein said subject is a
mammal.

80. The method of claim 79, wherein said mammal is a
human.

81. The method of claim 71, wherein said transfecting 
is done ex vivo and said transfected cells are reintroduced 
into said subject. 

82. The method of claim 71, wherein said transfecting 
is done in vivo in said subject. 

83. The method of claim 71, where said immune response
is a humoral immune response.

84. The method of claim 71, where said immune response
is a cellular immune response.

85. A gene delivery vector comprising an alphavirus
vector construct, wherein said alphavirus construct
comprises an expression cassette according to claim 1.

86. The gene delivery vector of claim 85, wherein the
alphavirus vector construct is a cDNA vector construct.

87. The gene delivery vector of claim 85, wherein the
alphavirus comprises a recombinant alphavirus particle
preparation.

88. The gene delivery vector of claim 85, wherein the
vector comprises a eukaryotic layered vector initiation
system.

89. A method of stimulating an immune response in a
subject comprising administering the gene delivery vector of
claim 85 in an amount effective to stimulate an immune
response in said subject.

90. The method of claim 89, wherein the gene delivery
vector is administered intramuscularly, intramucosally,
intranasally, subcutaneously, intradermally, transdermally,
intravaginally, intrarectally, orally or intravenously.

IMPROVED EXPRESSION OF HIV POLYPEPTIDES AND
PRODUCTION OF VIRUS-LIKE PARTICLES

Abstract of the Disclosure

The present invention relates to the efficient
expression of HIV polypeptides in a variety of cell types,
including, but not limited to, mammalian, insect, and plant
cells. Synthetic expression cassettes encoding the HIV
Gag-containing polypeptides are described, as are uses of the
expression cassettes in applications including DNA
immunization, generation of packaging cell lines, and
production of Env-, tat- or Gag-containing proteins. The
invention provides methods of producing Virus-Like Particles
(VLPs), as well as uses of the VLPs including, but not
limited to, vehicles for the presentation of antigens and
stimulation of immune response in subjects to whom the VLPs
are administered.

orig.gagSF2

[Nucleotide sequence labeled orig.gagSF2, beginning ATGGGTGCGAGAGCGTCGGTATTAAGC, annotated with numbered inactivation sites (the labels Inact. 2, 3, 5, 7 and 8 are legible) and with the nucleotide substitutions introduced at each site. The full sequence and its annotations are not reliably recoverable from the scanned drawing sheet.]

FIG. 1



native HIV-1SF2 gag-protease

[Nucleotide sequence annotated with numbered inactivation sites (labels up to Inact. 14 are legible), with the nucleotide substitutions introduced at each site, and with the region markers "From here codon optimization + inactivation (GP1) and (GP2)", "From here no changes to native sequence (GP1) and (GP2)", and "From here codon optimization + inactivation (GP1)". The full sequence and its annotations are not reliably recoverable from the scanned drawing sheet.]


FIGURE 3 





FIG. 5 



Protein level from 
native p55 gene 



Protein level from 
synthetic p55 gene 



FIG. 6 



[Multiple-sequence alignment of three nucleotide sequences, with rows labeled GagPol.ModSF, GagProt.ModS, and Gag.ModSF2 as printed on the sheets and a position ruler in 50-nucleotide blocks starting at position 1. The aligned sequences are not reliably recoverable from the scanned drawing sheets.]

FIG. 7 and FIG. 7 (cont'd.)



GagPol.ModSF 
GagProt.ModS 
Gag.ModSF2 



GagPol.ModSF 
GagProt.ModS 
Gag.ModSF2 



GagPol.ModSF 
GagProt.ModS 
Gag.ModSF2 



GagPol.ModSF 
GagProt.ModS 
Gag.ModSF2 



GagPol.ModSF 
GagProt.ModS 
Gag.ModSF2 



^: GagPol.ModSF 
S GagProt.ModS 
Gag.ModSF2 



1810 



1801 
1801 
1801 



1851 
1851 
1851 



1820 



IGAAGCCGGGI3ATGGACGGC 
TGAAGCCGGGi 3ATGGACGGC 



1830 



1840 



1850 



AGCAGTGGCC 
^GCAGTGGCC 



1860 1870 1880 1890 1900 

GAGAAGATCA AGGCCCTGGT .GGAGATCTGC ACCGAGATO3 AGAAGSftGGG 



1910 1920 1930 1940 1950 

1901 CAftGATCAGC AAGATCQGCC CCGAGAACOC CTACAACAOC CC0C3ICTT0G 

1901 

1901 

1960 1970 1980 1990 2000 

1951 CCATCAAGAA GAAGGACAGC ACCAAGTGGC GCAAQCT3GT GGACTTCCGC 

1951 

1951 

2010 2020 2030 2040 2050 

2001 GAGCTGAACA AGOGCACCCA GGACTTCK3G GAGGTGCAGC TOCXfCftTCOC 

2001 

2001 

2060 2070 2080 2090 2100

2051 OCAOOOC3GOC QQCCTGAAGA AGAAGAAGAG OC7IGACOGTG CTGGACX3TGG 

2051

2051 






2110 2120 2130 2140 2150 

GagPol.ModSF 2101 GCGAOGCCTA CITCAQ0GT3 CCXXTOGACA AGGACTTCCX3 CAAGTACACC 2150 

GagProt.ModS 2101 2150 

Gag.ModSF2 2101 2150 



2160 2170 2180 2190 2200 

GagPol.ModSF 2151 GCCTTCACCA TCCXXAGCAT CAACAACGAG ACCCCCGGCA T033CTACCA 2200 

GagProt.ModS 2151 2200 

Gag.ModSF2 2151 2200 



2210 2220 2230 2240 2250 

GagPol.ModSF 2201 GTACAACGTG CTGCCCCAGG GCTGGAAGGG CAQCCOOGOC ATCTTCCAGA 2250 

GagProt.ModS 2201 2250 

Gag.ModSF2 2201 2250 



2260 2270 2280 2290 2300 

GagPol.ModSF 2251 GCAGCATGAC CAAGATCCTG GAGCCCTTCC GCAAGCAGAA CXXXX3ACATC 2300 

GagProt.ModS 2251 2300 

Gag.ModSF2 2251 2300 



2310 2320 2330 2340 2350 

GagPol.ModSF 2301 GTGATCTACC AGTACATGGA CGACCTGTAC GTGGGCAGCG ACCTGGAGAT 2350 

GagProt.ModS 2301 2350 

Gag.ModSF2 2301 2350 

2360 2370 2380 2390 2400

GagPol.ModSF 2351 CGGCCAGCAC CGCACCAAGA TCGAGGAGCT GCGCCAGCAC CTQCTGQGCT 2400 

GagProt.ModS 2351 2400

Gag.ModSF2 2351 2400 






GagPol.ModSF
GagProt.ModS
Gag.ModSF2



2410 2420 2430 2440 2450 

2401 Q3Q3CTTCAC CAOCXXXXSftC AAGAAGCAOC AGAAGGAGCC CCCCTTCCTG 2450 

2401 2450 

2401 2450 

2460 2470 2480 2490 2500 

2451 TGGATGQQCT ACGAGCTQCA CCCCGACAAG TQGA.CCGTGC AGCCCATCAT 2500 

2451 2500 

2451 2500 

2510 2520 2530 2540 2550 

2501 GCTCCCX3GAG AAGGACAGCT GGAOCGTGAA OGACATCCAG AAGCTO3TO3 2550 

2501 2550 

2501 2550 

2560 2570 2580 2590 2600 

2551 GCAAGCTGAA CT3GGCCAGC CAGATCTAOG CCQGCATCAA GGTGAAGCAG 2600 

2551 2600 

2551 2600 

2610 2620 2630 2640 2650 

2601 CT3TGCAAGC TOCTGOQOGG CAOCAAGGCC CTGAOOGAGG TGA.TCCCCCT 2650 

2601 2650 

2601 2650 

2660 2670 2680 2690 2700 

2651 GAOOGAGGAG GOOGAGCTOG AGCTQGCCGA GAACOGOGAG ATCCTGAACG 2700 

2651 2700

2651 2700 

2710 2720 2730 2740 2750 

2701 AGCCCCTGCA CX5AGGTCTAC TACGACCCCA GCAAGGACCT GGTCGOOGAG 2750 

2701 2750 

2701 2750 

2760 2770 2780 2790 2800 

2751 ATCCAGAAGC AGGGCCAGGG CCAGTGGACC TACCAGATCT ACXAQGAGOC 2800 

2751 2800 

2751 2800 

2810 2820 2830 2840 2850 

2801 CTTCAAGAAC CTGAAGACOG GCAAGTAOGC CXX3CMX3CGC GGCGOTACA 2850 

2801 2850 

2801 2850 

2860 2870 2880 2890 2900 

2851 OCAACGACGT GAAGCAGCTC ACGGAGGCOG TQCAGAAGGT GAGGAOGGAG 2900 

2851 2900 

2851 2900 

2910 2920 2930 2940 2950 

2901 AGCATCGTGA TCTCQQGCAA GATCOOCAAG TTCAAQCTQC CCATCCAGAA 2950 

2901 2950 

2901 2950 

2960 2970 2980 2990 3000 

2951 GGAGACCTQG GAG3CCTQGT GGATGGAGTA CTGQCAGGOC AOCTQGATCC 3000 

2951 3000 

2951 3000 

3010 3020 3030 3040 3050 

3001 CXXSAGTGGGA GTTCGTGAAC ACCCCCCCCC TGGTGAAGCT GTQC3TAOCAG 3050 

3001 3050 

3001 3050 

3060 3070 3080 3090 3100 

3051 CTGGAGAAGG AGCCCATCGT GGGCGCOGAG ACCTTCTAOG TGGACGGCGC 3100 

3051 3100 

3051 3100 

3110 3120 3130 3140 3150 

3101 CGOCAACCGC GAGACCAAGC TOQGCAAGGC CGGCTACGTG ACCGACCGCG 3150 

3101 3150 

3101 3150 

3160 3170 3180 3190 3200 

3151 GCCGCCAGAA GGTQG1GAGC ATCGCOGACA CCACCAACCA GAAGAGOGAG 3200 

3151 3200 

3151 3200 






GagPol.ModSF
GagProt.ModS
Gag.ModSF2



3210 3220 3230 3240 3250 

3201 CIK3CAGGCCA TCCACCTQGC CCTCCAGGAC AGOQQOCTQG AGGTGAACAT 3250 

3201 3250 

3201 3250 

3260 3270 3280 3290 3300 

3251 OGTGAOCGAC AGCCAGTACG CXXTGGGCAT CATQCfiGGOC CAQCCCX5ACA 3300 

3251 3300

3251 3300 

3310 3320 3330 3340 3350 

3301 AGAGCGAGAG CGAGCTGGTG AGCCAGATCA TCGAGCAGCT GATCAAGAAG 3350 

3301 3350 

3301 3350 

3360 3370 3380 3390 3400 

3351 GAGAAGGTGT ACCTGGCCTG GGTQCOCGCC CACAAGQGCA TCQGCQGCAA 3400 

3351 3400 

3351 3400 

3410 3420 3430 3440 3450 

3401 CGAGCAGCTG GACAAGCTGG TGAQOGCOGG CATOCGCAAG GTGCTGTTCC 3450 

3401 3450 

3401 3450

3460 3470 3480 3490 3500

3451 TGAACGOCAT CGACAAQGCC CAGGAQGAGC ACGAGAAGTA CCACAGCAAC 3500 

3451 3500

3451 3500 

3510 3520 3530 3540 3550 

3501 V3QCGCGCCA TGGCCAGOGA CTTCAACCTC COCXXX3GKK3G TGGCCAAGGA 3550 

3501 3550 

3501 3550

3560 3570 3580 3590 3600 

3551 GATCGTOGCC AGCTGCGACA AGTGCCAGCT GAAOGGCGAG GCCATGCACG 3600 

3551 3600

3551 3600 

3610 3620 3630 3640 3650 

3601 GCCAGGTGGA CTGCAGCCCC QGCATCVGQC AGCTGGACTG CACCCACCTG 3650 

3601 3650 

3601 3650

3660 3670 3680 3690 3700 

3651 GAGQGCAAfiA TCATCCTQGT GGOCCIQCAC GTQGCGAJ30G GCmCKTCGA 3700 

3651 3700

3651 3700 

3710 3720 3730 3740 3750 

3701 GGCCGflGGTG ATCCCGGCOG AGACCGGCCA GGAGAOOGOC TACTTCCTGC 3750 

3701 3750

3701 3750 



FIG. 7 (cont'd.) 



3760 3770 3780 3790 3800 

GagPol.ModSF 3751 TGAAGCTGGC CX3GCCGCTQG CCOGTGAAGA CCATCCACAC OSACAACGGC 3800 

GagProt.ModS 3751 3800 

Gag.ModSF2 3751 3800 

3810 3820 3830 3840 3850 

GagPol.ModSF 3801 AGCAACTTCA CCAGCACCAC CGTGAAGGCC GCCTGCTGGT G3GCCX3GCAT 3850 

GagProt.ModS 3801 3850 

Gag.ModSF2 3801 3850 

3860 3870 3880 3890 3900 

GagPol.ModSF 3851 CAAGCW3GAG TTCGGCATCC CCTACAACCC CCAGAGCCAG GQOGTQGTGG 3900 

GagProt.ModS 3851 3900 

Gag.ModSF2 3851 3900 

3910 3920 3930 3940 3950 

GagPol.ModSF 3901 AGAGCATGAA CAACGAGCTG AAGAAGATCA TO3GCCAGGT C5CGCGACCAG 3950 

GagProt.ModS 3901 3950 

Gag.ModSF2 3901 3950 

3960 3970 3980 3990 4000 

GagPol.ModSF 3951 GCCGAGCACC TQAAGACCGC CGTGCW3ATC GCCX3TGTTCA TCCACAACTT 4000 

GagProt.ModS 3951 4000 

Gag.ModSF2 3951 4000 






4010 4020 4030 4040 4050 

GagPol.ModSF 4001 CAAQOGCAAG GQOOQCATOS GOOQC7ACAG CGCCQGCGAG CX3CATCX3TX3G 4050 

GagProt.ModS 4001 4050 

Gag.ModSF2 4001 4050

FIG. 7 (cont'd.)

4060 4070 4080 4090 4100

GagPol.ModSF 4051 ACMCATOGC CACCGACATC CAGACCAAGG AGCTGCAGAA GCAGATCACC 4100 

GagProt.ModS 4051 4100 

Gag.ModSF2 4051 4100 

4110 4120 4130 4140 4150 

GagPol.ModSF 4101 AAGATOCAGA ACTTCCX3CGT GTACTAOOGC GACAACAAGG ACCCCCTGrTC 4150 

GagProt.ModS 4101 4150 

Gag.ModSF2 4101 4150 

4160 4170 4180 4190 4200 

GagPol.ModSF 4151 GAAGGQOCCC GCCAAQCTGSC TGTGGAAGGG CGAQGGOGCC CTOGTGATCC 4200 

GagProt.ModS 4151 4200 

Gag.ModSF2 4151 4200 

4210 4220 4230 4240 4250 

GagPol.ModSF 4201 AGGACAACAG OGACATCAAG GTQGTGCCCC GCCGCAAGGC CAAGATCATC 4250 

GagProt.ModS 4201 4250 

Gag.ModSF2 4201 4250 



GagPol.ModSF
GagProt.ModS
Gag.ModSF2



4260 4270 4280 4290 4300 

4251 OGOGACTAOG GCAAGCAGAT GGCCGGCGAC GACTC3CGTQG CCAGCOGCCA 4300 

4251 4300 

4251 4300 

4310 4320 4330 4340 4350 

4301 GGACGAGGAC TAG 4350 

4301 4350

4301 4350 



Cn*ct 
i TATAAJJ 



to*ct.l 

AA ykAgTAJAAGJTAAAACAJAT^ TTATGGGCAAGCAGGGAGCTAGAACGATTCGCAGTCAATCCTGGCCTGTTXGAX 
ACATCAGAAGGCTG<*GACAAATATTG«^CAGCTACAG<XATCC^^ 



lTCAGA ^GAgCTT^GATCAJTJ 



iGTAGCAACCCTCTATTGTGTJ 



r iM6t.j i 

'ACMTCAAAGGATAGATGTAAAn3A< 
c gg c n d 



tCACCAAGGAAGCTTTAGAGAAGATA 



GAGGAAGAGCAAAA< 



,CA?t^GTAAGAAAA^6GCACAGCA^3CA( 

farce c c q 



lGCAGCTGCAGCTGGCACAGGAAACAGCAGCCAGGTC 



AGCCAAAATTACCCTATAGTGCAGAACCTACAGGGGCAAATGGT^ 
TGGGTAAAAGTAGTAGAAGAAAAGGCTTTCAGCCCAGAAGTAATACCCATGTTTTCA 
CCA< 



,GATTTAAA^X?8AflCTA\ACAC?fGTGGG<X5GAC ATC AAGCAGCC AT GCAAATGTT AAAAGAGACT ATCAAT 
CCC C T C < 



GAGGAAGClX^GAATGGGATAGAGTG<^TCCAGTGCATGCAGG^ 
GGAAGTGACATAGCAGGAACTACTAGTACCCTTCAGGAACAAATAGGATG^ 

TGTATAGCCCTACCAGCATTCTGGAC 



kTCTATAAAAGATGGAT 
CCC 



iTTAAATAAAATAGTA 

s c c c c cl 



ATAAGACAAGGACCAAAGGAACCCTTTAGAGATTATGT^ 
CAGGATGTAAAAAATTGGATGACAGAAACCTTGTTGGTCCAAAAT^ 



I ifii6tn — i 

XxrATTTTAAAAGCAj 



TTG t 5G , A&AGCA iCTACACTAGAAGAAATGATGACAGCATGTCAGGGAGTGGGGGGACC^ 
CCC c 



tn*ct.8 
_ iTGAGCCAAGTAACAAATCCAGCTi 

c c c c c c c 



tnact,9 

ACTGTTAAGTGT|rrCAATTGTCGCAAAGAAGOGCA< 



3^' 



TAATGATGCAGAGAGG ^ATTTTAGGAACCAAAGAAAG 
C CC GC C 



lCa|a| 



AGGCAAAAATTGCAGGG<XCCTAGGAA/KAGGG<^rGTrGG 
" C C CC c 



AGATGTGGAAGGGAAGGACACCAAATGAAAGATTGCACTGAGAGAC^ 
TACAAGGGAAGGCCAC^GAATTTTCm^G^ 

GAGGAGAAAACAACTCCCTCTCAGAAGCAGGAGCCGATAGACAAGGA^ 



TTTGGCAACGACCCCTCGTCACA&TAAGGATAGGGGGGC^ 

C C G CC ' 



lTACAGGAGCAGATGATA 



^ Inscfc.lS 

CAGTATTAGAAGAAATGAATTTGCCAGGAAAATGGAAACCAAAAATGATAG© 



C G C C 



c c 



^GAXACCTGTAG^TCTGXGGACATiU^ 



In»ctA3 

ACATAATTGGAAGAAATCTGTTGAC rCAGATTCGTIGTACTITMATTTCCCCATTAGTCC rATTGA tACTGTACCAG 
. C C C C C C C C ~ " 



kCTGTACCA 
G G G C 



lTTAAAGCCAGGAATGGATGGCCCAAAAGT^AGCAATGGCCATTGACA 



c c 



c c 



CCC 



AGATATGTACAGAAATGGAAAAGGAAGGGAAAATTTCAAAAATTGGG<XTGAAAATC^ 

CTATAAAGAAAAAAGACAGTACTAAAtGGAGAAAACTAGTAGATTTCAGAGAAOT 

GGGAAGTTCACTTAGGAATACCAGACCCCGCAGGGTT^ 

CATACTTTTCAGTTCCCTTAGATAAAGACTTTAGAAACT 

CAGGGATIAGATATCAGTACAATGTGCIGCCACAGGGATC^ 

AAATCTTAGAGCCTTTTAGAAAACACAAICCAGACATAGTTATCT 

ACTIAGAAATAGGGCAGCATAGAACAAAAATAGAGGAACTGAG^ 

ACAAAAAACATX^GAAAGAACCTCCATTCCTTTGGATGGGTTAT^ 

TAATGCTGCCAGAAAAAGACAGCTGGACTGTCAATGACATA 

TTTATGCAGGGATTAAAGTAMGCAGTTATXyTAAACTCCTT'AG^ 

CAGAAGAAGCAGAGCTAGAACTGGCAGAAAACAGGGAGATTCTAAAAGAACCAGTACAT^ 

CAAAAGACTTAGTAGCAGAAATACAGAAGCAGGGGCAAGGCCAATGGA 

ATCTGAAAACAGGAAAOTATGCAA<^TGAG<5G€TGC^ 

AAGTATCCACAGAAAGCATAGTAATATGGGGAAAGATTCCTAAATrTAAACT 

CATGGTGGATGGAGTATTGGCAAGCTACCTGGATTCCTGAGT^^ 

GGTACCAGTTAGAGAAAGAAOXATAGTAGGAGCAGAAACTTTCTATCT^ 

TAGGAAAAGCAGGATATGTTACTGACAGAGGAAGACAAAAAGTTOTCTCCATA^ 

AATTACAAGCAATTCATCTAGCTTTGCAGGATTCGGGAT^ 

(yu^TCATTCAAGCACAACCAGATAAGAGTGAATCAGAGTTAGTC^ 

AGGTCTACCTGGCATGGGTACCAGCACACAAAGGAATTC 

TX^GGAAAGTACTATTTTTGAATGGAATAGATAAGGCCCAAGAAGAACATGAGA 
TGGCTAGTGATTTTAA<XTGCCACCTGTAGTAGCAAAAGAAA 
AAGCCXTGCATGGACAAGTAGACTGTAGTCCA<MtfU^T^ 
TGGTAGCAGTTCATGTAGCCAGTGGATATATAGA^ 
TTCTCTTAAAATTAGCAG<yVAGATGGCCAGTAAAAACAATACATACAG^ 
TTAAGGCCGCCTGTTGGTCGGCAGGGATCAAGCAGGAATTTGG<^ 
AATCTATGAATAATGAATTAAAGAAAATTATAG<ACAGGTAA<»GA^ 
TGGCACTATTCATCCACAATTTTAAAAGAAAAGGGGGGATTX^^ 
TAGCAACAGACATACAAACTAAAGAACTACAAAAG<yiAATTACAAAAATTCAAA^ 
ACAAAGATCCCCTTTGGAAAOGACCAGCAAAGCTTCTCTOGAAAG^ 
ACATAAAAGTAGTGCCAAGAAGAAAAGCAAAAATCATTAGGGATTATGGAA^ 
CAAGTAGACAGGATGAGGATTAG 
gp120wtSF162



GTAGAAAAATTGTGGGTCACAGTCTATTATGGGGTACCTGTGTGGAAAGAAGCAACCACCACTCTATTTT 
GTGCATCAGATGCTAAAGCCTATGACACAGAGGTACATAATGTCTGGGCCACACATGCCTGTGTACCCAC 
AGACCCTAACCCACAAGAAATAGTATTGGAAAATGTGACAGAAAATTTTAACATGTGGAAAAATAACATG 
GTAGAACAGATGCATGAGGATATAATCAGTTTATGGGATCAAAGTCTAAAGCCATGTGTAAAGTTAACCC 
CACTCTGTGTTACTCTACATTGCACTAATTTGAAGAATGCTACTAATACCAAGAGTAGTAATTGGAAAGA 
GATGGACAGAGGAGAAATAAAAAATTGCTCTTTCAAGGTCACCACAAGCATAAGAAATAAGATGCAGAAA 
GAATATGCACTTTTTTATAAACTTGATGTAGTACCAATAGATAATGATAATACAAGCTATAAATTGATAA 
ATTGTAACACCTCAGTCATTACACAGGCCTGTCCAAAGGTATCCTTTGAACCAATTCCCATACATTATTG 
TGCCCCGGCTGGTTTTGCGATTCTAAAGTGTAATGATAAGAAGTTCAATGGATCAGGACCATGTACAAAT 
GTCAGCACAGTACAATGTACACATGGAATTAGGCCAGTAGTGTCAACTCAATTGCTGTTAAATGGCAGTC 
TAGCAGAAGAAGGGGTAGTAATTAGATCTGAAAATTTCACAGACAATGCTAAAACTATAATAGTACAGCT 
GAAGGAATCTGTAGAAATTAATTGTACAAGACCTAACAATAATACAAGAAAAAGTATAACTATAGGACCG 
GGGAGAGCATTTTATGCAACAGGAGACATAATAGGAGATATAAGACAAGCACATTGTAACATTAGTGGAG 
AAAAATGGAATAACACTTTAAAACAGATAGTTACAAAATTACAAGCACAATTTGGGAATAAAACAATAGT 
CTTTAAGCAATCCTCAGGAGGGGACCCAGAAATTGTAATGCACAGTTTTAATTGTGGAGGGGAATTTTTC 
TACTGTAATTCAACACAGCTTTTTAATAGTACTTGGAATAATACTATAGGGCCAAATAACACTAATGGAA 
CTATCACACTCCCATGCAGAATAAAACAAATTATAAACAGGTGGCAGGAAGTAGGAAAAGCAATGTATGC 
CCCTCCCATCAGAGGACAAATTAGATGCTCATCAAATATTACAGGACTGCTATTAACAAGAGATGGTGGT 
AAAGAGATCAGTAACACCACCGAGATCTTCAGACCTGGAGGTGGAGATATGAGGGACAATTGGAGAAGTG 
AATTATATAAATATAAAGTAGTAAAAATTGAGCCATTAGGAGTAGCACCCACCAAGGCAAAGAGAAGAGT 
GGTGCAGAGAGAAAAAAGA 



FIG. 16 (SEQ ID NO:30)



gp140wtSF162

GTAGAAAAATTGTGGGTCACAGTCTATTATGGGGTACCTGTGTGGAAAGAAGCAACCACCACTCTATTTT 

GTGCATCAGATGCTAAAGCCTATGACACAGAGGTACATAATGTCTGGGCCACACATGCCTGTGTACCCAC 

AGACCCTAACCCACAAGAAATAGTATTGGAAAATGTGACAGAAAATTTTAACATGTGGAAAAATAACATG 

GTAGAACAGATGCATGAGGATATAATCAGTTTATGGGATCAAAGTCTAAAGCCATGTGTAAAGTTAACCC 

CACTCTGTGTTACTCTACATTGCACTAATTTGAAGAATGCTACTAATACCAAGAGTAGTAATTGGAAAGA 

GATGGACAGAGGAGAAATAAAAAATTGCTCTTTCAAGGTCACCACAAGCATAAGAAATAAGATGCAGAAA 

GAATATGCACTTTTTTATAAACTTGATGTAGTACCAATAGATAATGATAATACAAGCTATAAATTGATAA 

ATTGTAACACCTCAGTCATTACACAGGCCTGTCCAAAGGTATCCTTTGAACCAATTCCCATACATTATTG

TGCCCCGGCTGGTTTTGCGATTCTAAAGTGTAATGATAAGAAGTTCAATGGATCAGGACCATGTACAAAT 

GTCAGCACAGTACAATGTACACATGGAATTAGGCCAGTAGTGTCAACTCAATTGCTGTTAAATGGCAGTC 

TAGCAGAAGAAGGGGTAGTAATTAGATCTGAAAATTTCACAGACAATGCTAAAACTATAATAGTACAGCT 

GAAGGAATCTGTAGAAATTAATTGTACAAGACCTAACAATAATACAAGAAAAAGTATAACTATAGGACCG 

GGGAGAGCATTTTATGCAACAGGAGACATAATAGGAGATATAAGACAAGCACATTGTAACATTAGTGGAG 

AAAAATGGAATAACACTTTAAAACAGATAGTTACAAAATTACAAGCACAATTTGGGAATAAAACAATAGT 

CTTTAAGCAATCCTCAGGAGGGGACCCAGAAATTGTAATGCACAGTTTTAATTGTGGAGGGGAATTTTTC 

TACTGTAATTCAACACAGCTTTTTAATAGTACTTGGAATAATACTATAGGGCCAAATAACACTAATGGAA 

CTATCACACTCCCATGCAGAATAAAACAAATTATAAACAGGTGGCAGGAAGTAGGAAAAGCAATGTATGC

CCCTCCCATCAGAGGACAAATTAGATGCTCATCAAATATTACAGGACTGCTATTAACAAGAGATGGTGGT 

AAAGAGATCAGTAACACCACCGAGATCTTCAGACCTGGAGGTGGAGATATGAGGGACAATTGGAGAAGTG 

AATTATATAAATATAAAGTAGTAAAAATTGAGCCATTAGGAGTAGCACCCACCAAGGCAAAGAGAAGAGT 

GGTGCAGAGAGAAAAAAGAGCAGTGACGCTAGGAGCTATGTTCCTTGGGTTCTTGGGAGCAGCAGGAAGC 

ACTATGGGCGCACGGTCACTGACGCTGACGGTACAGGCCAGACAATTATTGTCTGGTATAGTGCAACAGC 

AGAACAATTTGCTGAGAGCTATTGAGGCGCAACAGCATCTGTTGCAACTCACAGTCTGGGGCATCAAGCA 

GCTCCAGGCAAGAGTCCTGGCTGTGGAAAGATACCTAAAGGATCAACAGCTCCTAGGGATTTGGGGTTGC 

TCTGGAAAACTCATTTGCACCACTGCTGTGCCTTGGAATGCTAGTTGGAGTAATAAATCTCTGGATCAGA 

TTTGGAATAACATGACCTGGATGGAGTGGGAGAGAGAAATTGACAATTACACAAACTTAATATACACCTT 

AATTGAAGAATCGCAGAACCAACAAGAAAAGAATGAACAAGAATTATTAGAATTGGATAAGTGGGCAAGT 

TTGTGGAATTGGTTTGACATATCAAAATGGCTGTGGTATATA 



FIG. 17 (SEQ ID NO:31)



gp160wtSF162



GTAGAAAAATTGTGGGTCACAGTCTATTATGGGGTACCTGTGTGGAAAGAAGCAACCACCACTCTATTTT 

GTGCATCAGATGCTAAAGCCTATGACACAGAGGTACATAATGTCTGGGCCACACATGCCTGTGTACCCAC 

AGACCCTAACCCACAAGAAATAGTATTGGAAAATGTGACAGAAAATTTTAACATGTGGAAAAATAACATG

GTAGAACAGATGCATGAGGATATAATCAGTTTATGGGATCAAAGTCTAAAGCCATGTGTAAAGTTAACCC 

CACTCTGTGTTACTCTACATTGCACTAATTTGAAGAATGCTACTAATACCAAGAGTAGTAATTGGAAAGA 

GATGGACAGAGGAGAAATAAAAAATTGCTCTTTCAAGGTCACCACAAGCATAAGAAATAAGATGCAGAAA 

GAATATGCACTTTTTTATAAACTTGATGTAGTACCAATAGATAATGATAATACAAGCTATAAATTGATAA 

ATTGTAACACCTCAGTCATTACACAGGCCTGTCCAAAGGTATCCTTTGAACCAATTCCCATACATTATTG 

TGCCCCGGCTGGTTTTGCGATTCTAAAGTGTAATGATAAGAAGTTCAATGGATCAGGACCATGTACAAAT 

GTCAGCACAGTACAATGTACACATGGAATTAGGCCAGTAGTGTCAACTCAATTGCTGTTAAATGGCAGTC 

TAGCAGAAGAAGGGGTAGTAATTAGATCTGAAAATTTCACAGACAATGCTAAAACTATAATAGTACAGCT 

GAAGGAATCTGTAGAAATTAATTGTACAAGACCTAACAATAATACAAGAAAAAGTATAACTATAGGACCG 

GGGAGAGCATTTTATGCAACAGGAGACATAATAGGAGATATAAGACAAGCACATTGTAACATTAGTGGAG 

AAAAATGGAATAACACTTTAAAACAGATAGTTACAAAATTACAAGCACAATTTGGGAATAAAACAATAGT 

CTTTAAGCAATCCTCAGGAGGGGACCCAGAAATTGTAATGCACAGTTTTAATTGTGGAGGGGAATTTTTC 

TACTGTAATTCAACACAGCTTTTTAATAGTACTTGGAATAATACTATAGGGCCAAATAACACTAATGGAA 

CTATCACACTCCCATGCAGAATAAAACAAATTATAAACAGGTGGCAGGAAGTAGGAAAAGCAATGTATGC 

CCCTCCCATCAGAGGACAAATTAGATGCTCATCAAATATTACAGGACTGCTATTAACAAGAGATGGTGGT 

AAAGAGATCAGTAACACCACCGAGATCTTCAGACCTGGAGGTGGAGATATGAGGGACAATTGGAGAAGTG 

AATTATATAAATATAAAGTAGTAAAAATTGAGCCATTAGGAGTAGCACCCACCAAGGCAAAGAGAAGAGT 

GGTGCAGAGAGAAAAAAGAGCAGTGACGCTAGGAGCTATGTTCCTTGGGTTCTTGGGAGCAGCAGGAAGC 

ACTATGGGCGCACGGTCACTGACGCTGACGGTACAGGCCAGACAATTATTGTCTGGTATAGTGCAACAGC 

AGAACAATTTGCTGAGAGCTATTGAGGCGCAACAGCATCTGTTGCAACTCACAGTCTGGGGCATCAAGCA 

GCTCCAGGCAAGAGTCCTGGCTGTGGAAAGATACCTAAAGGATCAACAGCTCCTAGGGATTTGGGGTTGC 

TCTGGAAAACTCATTTGCACCACTGCTGTGCCTTGGAATGCTAGTTGGAGTAATAAATCTCTGGATCAGA 

TTTGGAATAACATGACCTGGATGGAGTGGGAGAGAGAAATTGACAATTACACAAACTTAATATACACCTT 

AATTGAAGAATCGCAGAACCAACAAGAAAAGAATGAACAAGAATTATTAGAATTGGATAAGTGGGCAAGT 

TTGTGGAATTGGTTTGACATATCAAAATGGCTGTGGTATATAAAAATATTCATAATGATAGTAGGAGGTT 

TAGTAGGTTTAAGGATAGTTTTTACTGTGCTTTCTATAGTGAATAGAGTTAGGCAGGGATACTCACCATT 

ATCATTTCAGACCCGCTTCCCAGCCCCAAGGGGACCCGACAGGCCCGAAGGAATCGAAGAAGAAGGTGGA 

GAGAGAGACAGAGACAGATCCAGTCCATTAGTGCATGGATTATTAGCACTCATCTGGGACGATCTACGGA 

GCCTGTGCCTCTTCAGCTACCACCGCTTGAGAGACTTAATCTTGATTGCAGCGAGGATTGTGGAACTTCT

GGGACGCAGGGGGTGGGAAGCCCTCAAGTATTGGGGGAATCTCCTGCAGTATTGGATTCAGGAACTAAAG 

AATAGTGCTGTTAGTTTGTTTGATGCCATAGCTATAGCAGTAGCTGAGGGGACAGATAGGATTATAGAAG 

TAGCACAAAGAATTGGTAGAGCTTTTCTCCACATACCTAGAAGAATAAGACAGGGCTTTGAAAGGGCTTT 

GCTATAA 



FIG. 18 (SEQ ID NO:32)



gp120.modSF162



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 
aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 
agcttcaaggtgaccaccagcatccgcaacaagatgcagaaggagtacgccctgttctacaagctg 
gacgtggtgcccatcgacaacgacaacaccagctacaagctgatcaactgcaacaccagcgtgatc 
acccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttc 
gccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtg 
cagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgag 
gagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaag 
gagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcaccatcggcccc 
ggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagc 
ggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaag 
accatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggc 
ggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcggcccc 
aacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggag 
gtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggc 
ctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggc 
ggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctg 
ggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcgctaactcgag 



FIG. 19 (SEQ ID NO:33)



gp120.modSF162.delV2

gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 
aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 
agcttcaaggtgggcgccggcaagctgatcaactgcaacaccagcgtgatcacccaggcctgcccc 
aaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgc 
aacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggc 
atccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatc 
cgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagatc 
aactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgccttctac 
gccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgagaagtggaac
aacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagaccatcgtgttcaag
cagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggcggcgagttcttctac
tgcaacagcacccagctgttcaacagcacctggaacaacaccatcggccccaacaacaccaacggc
accatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggaggtgggcaaggccatg
tacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggcctgctgctgacccgc
gacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgac
aactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccacc
aaggccaagcgccgcgtggtgcagcgcgagaagcgctaactcgag



FIG. 20 (SEQ ID NO:34)



gp120.modSF162.delV1V2



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgggcgccggcaactgccagacc 
agcgtgatcacccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgccccc 
gccggcttcgccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtg 
agcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagc 
ctggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtg 
cagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcacc 
atcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgc 
aacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttc 
ggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttc 
aactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacacc 
atcggccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgc 
tggcaggaggtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaac 
atcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgc 
cccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatc 
gagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcgctaactc 
gag 



FIG. 21 (SEQ ID NO:35)
gp140.modSF162



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 
aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 
agcttcaaggtgaccaccagcatccgcaacaagatgcagaaggagtacgccctgttctacaagctg 
gacgtggtgcccatcgacaacgacaacaccagctacaagctgatcaactgcaacaccagcgtgatc 
acccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttc 
gccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtg 
cagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgag
gagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaag
gagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcaccatcggcccc 
ggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagc 
ggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaag 
accatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggc 
ggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcggcccc 
aacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggag 
gtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggc
ctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggc 
ggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctg 
ggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcgcgccgtgaccctgggc 
gccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgacc 
gtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgag 
gcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggcc 
gtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgc 
accaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaacatg 
acctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggag 
agccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtgg 
aactggttcgacatcagcaagtggctgtggtacatctaactcgag 



FIG. 23 (SEQ ID NO:36)



gp140.modSF162.delV2



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 
aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 
agcttcaaggtgggcgccggcaagctgatcaactgcaacaccagcgtgatcacccaggcctgcccc 
aaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgc
aacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggc 
atccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatc 
cgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagatc 
aactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgccttctac 
gccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgagaagtggaac 
aacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagaccatcgtgttcaag 
cagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggcggcgagttcttctac 
tgcaacagcacccagctgttcaacagcacctggaacaacaccatcggccccaacaacaccaacggc 
accatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggaggtgggcaaggccatg 
tacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggcctgctgctgacccgc 
gacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgac 
aactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccacc 
aaggccaagcgccgcgtggtgcagcgcgagaagcgcgccgtgaccctgggcgccatgttcctgggc
ttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgaccgtgcaggcccgccag 
ctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgaggcccagcagcacctg 
ctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggccgtggagcgctacctg 
aaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgcaccaccgccgtgccc 
tggaacgccagctggagcaacaagagcctggaccagatctggaacaacatgacctggatggagtgg 
gagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggagagccagaaccagcag 
gagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtggaactggttcgacatc 
agcaagtggctgtggtacatctaactcgag 



FIG. 24 (SEQ ID NO: 37) 



gp140.modSF162.delV1V2



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgggcgccggcaactgccagacc 
agcgtgatcacccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgccccc 
gccggcttcgccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtg 
agcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagc 
ctggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtg 
cagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcacc 
atcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgc 
aacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttc 
ggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttc 
aactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacacc 
atcggccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgc 
tggcaggaggtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaac 
atcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgc 
cccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatc 
gagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcgcgccgtg 
accctgggcgccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctg 
accctgaccgtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgc 
gccatcgaggcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgc 
gtgctggccgtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaag 
ctgatctgcaccaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctgg 
aacaacatgacctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctg 
atcgaggagagccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggcc 
agcctgtggaactggttcgacatcagcaagtggctgtggtacatctaactcgag 



FIG. 25 (SEQ ID NO:38)



gp140.mut.modSF162



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 
aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 
agcttcaaggtgaccaccagcatccgcaacaagatgcagaaggagtacgccctgttctacaagctg 
gacgtggtgcccatcgacaacgacaacaccagctacaagctgatcaactgcaacaccagcgtgatc 
acccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttc 
gccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtg 
cagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgag 

gagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaag

gagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcaccatcggcccc 
ggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagc 
ggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaag 
accatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggc 
ggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcggcccc 
aacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggag 

gtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggc

ctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggc 
ggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctg 
ggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagagcgccgtgaccctgggc 
gccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgacc 
gtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgag 
gcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggcc 
gtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgc 
accaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaacatg 
acctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggag 
agccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtgg 
aactggttcgacatcagcaagtggctgtggtacatctaactcgag 



FIG. 26 (SEQ ID NO:39)



gp140.mut.modSF162.delV2



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 
aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc
agcttcaaggtgggcgccggcaagctgatcaactgcaacaccagcgtgatcacccaggcctgcccc 
aaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgc 
aacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggc 
atccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatc 
cgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagatc 
aactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgccttctac 
gccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgagaagtggaac 
aacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagaccatcgtgttcaag
cagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggcggcgagttcttctac
tgcaacagcacccagctgttcaacagcacctggaacaacaccatcggccccaacaacaccaacggc
accatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggaggtgggcaaggccatg 
tacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggcctgctgctgacccgc 
gacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgac 
aactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccacc 
aaggccaagcgccgcgtggtgcagcgcgagaagagcgccgtgaccctgggcgccatgttcctgggc 
ttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgaccgtgcaggcccgccag 
ctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgaggcccagcagcacctg
ctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggccgtggagcgctacctg
aaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgcaccaccgccgtgccc
tggaacgccagctggagcaacaagagcctggaccagatctggaacaacatgacctggatggagtgg 
gagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggagagccagaaccagcag
gagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtggaactggttcgacatc 
agcaagtggctgtggtacatctaactcgag 



FIG. 27 (SEQ ID NO: 40) 



gp140.mut.modSF162.delV1V2



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgggcgccggcaactgccagacc 

agcgtgatcacccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgccccc 

gccggcttcgccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtg 

agcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagc 

ctggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtg 

cagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcacc 

atcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgc 

aacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttc 

ggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttc 

aactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacacc 

atcggccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgc 

tggcaggaggtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaac 

atcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgc 

cccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatc 

gagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagagcgccgtg 

accctgggcgccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctg 

accctgaccgtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgc 

gccatcgaggcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgc 

gtgctggccgtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaag 

ctgatctgcaccaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctgg 

aacaacatgacctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctg 

atcgaggagagccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggcc 

agcctgtggaactggttcgacatcagcaagtggctgtggtacatctaactcgag 



FIG. 28 (SEQ ID NO:41)



gp140.mut7.modSF162



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 

aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 

agcttcaaggtgaccaccagcatccgcaacaagatgcagaaggagtacgccctgttctacaagctg 

gacgtggtgcccatcgacaacgacaacaccagctacaagctgatcaactgcaacaccagcgtgatc 

acccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttc 

gccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtg 

cagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgag 

gagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaag 

gagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcaccatcggcccc

ggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagc 

ggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaag 

accatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggc 

ggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcggcccc 

aacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggag 

gtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggc 

ctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggc 

ggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctg 

ggcgtggcccccaccaaggccatcagcagcgtggtgcagagcgagaagagcgccgtgaccctgggc 

gccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgacc 

gtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgag 

gcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggcc 

gtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgc 

accaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaacatg 

acctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggag 

agccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtgg 

aactggttcgacatcagcaagtggctgtggtacatctaactcgag 



FIG. 29 (SEQ ID NO:42)



gp140.mut7.modSF162.delV2



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 
aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 
agcttcaaggtgggcgccggcaagctgatcaactgcaacaccagcgtgatcacccaggcctgcccc 
aaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgc 
aacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggc 
atccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatc 
cgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagatc 
aactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgccttctac 
gccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgagaagtggaac 
aacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagaccatcgtgttcaag 
cagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggcggcgagttcttctac 
tgcaacagcacccagctgttcaacagcacctggaacaacaccatcggccccaacaacaccaacggc 
accatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggaggtgggcaaggccatg 
tacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggcctgctgctgacccgc 
gacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgac 
aactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccacc 
aaggccatcagcagcgtggtgcagagcgagaagagcgccgtgaccctgggcgccatgttcctgggc 
ttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgaccgtgcaggcccgccag 
ctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgaggcccagcagcacctg 
ctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggccgtggagcgctacctg 
aaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgcaccaccgccgtgccc 
tggaacgccagctggagcaacaagagcctggaccagatctggaacaacatgacctggatggagtgg 
gagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggagagccagaaccagcag 
gagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtggaactggttcgacatc 
agcaagtggctgtggtacatctaactcgag 



FIG. 30 (SEQ ID NO:43)



gp140.mut7.modSF162.delV1V2



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgggcgccggcaactgccagacc 

agcgtgatcacccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgccccc 

gccggcttcgccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtg 

agcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagc 

ctggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtg 

cagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcacc 

atcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgc 

aacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttc 

ggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttc 

aactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacacc 

atcggccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgc 

tggcaggaggtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaac 

atcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgc 

cccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatc 

gagcccctgggcgtggcccccaccaaggccatcagcagcgtggtgcagagcgagaagagcgccgtg 

accctgggcgccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctg 

accctgaccgtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgc 

gccatcgaggcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgc 

gtgctggccgtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaag 

ctgatctgcaccaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctgg 

aacaacatgacctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctg 

atcgaggagagccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggcc 

agcctgtggaactggttcgacatcagcaagtggctgtggtacatctaactcgag 



FIG. 31 (SEQ ID NO: 44) 



gp140.mut8.modSF162



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 

aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 

agcttcaaggtgaccaccagcatccgcaacaagatgcagaaggagtacgccctgttctacaagctg 

gacgtggtgcccatcgacaacgacaacaccagctacaagctgatcaactgcaacaccagcgtgatc 

acccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttc 

gccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtg 

cagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgag 

gagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaag 

gagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcaccatcggcccc 

ggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagc 

ggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaag 

accatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggc 

ggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcggcccc 

aacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggag 

gtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggc 

ctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggc 

ggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctg 

ggcgtggcccccaccatcgccatcagcagcgtggtgcagagcgagaagagcgccgtgaccctgggc 

gccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgacc 

gtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgag 

gcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggcc 

gtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgc 

accaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaacatg 

acctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggag 

agccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtgg 

aactggttcgacatcagcaagtggctgtggtacatctaactcgag 



FIG. 32 (SEQ ID NO:45)



gp140.mut8.modSF162.delV2



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 

aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 

agcttcaaggtgggcgccggcaagctgatcaactgcaacaccagcgtgatcacccaggcctgcccc 

aaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgc 

aacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggc 

atccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatc 

cgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagatc 

aactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgccttctac 

gccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgagaagtggaac 

aacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagaccatcgtgttcaag 

cagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggcggcgagttcttctac 

tgcaacagcacccagctgttcaacagcacctggaacaacaccatcggccccaacaacaccaacggc 

accatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggaggtgggcaaggccatg 

tacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggcctgctgctgacccgc 

gacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgac 

aactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccacc 

atcgccatcagcagcgtggtgcagagcgagaagagcgccgtgaccctgggcgccatgttcctgggc 

ttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgaccgtgcaggcccgccag 

ctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgaggcccagcagcacctg 

ctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggccgtggagcgctacctg 

aaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgcaccaccgccgtgccc 

tggaacgccagctggagcaacaagagcctggaccagatctggaacaacatgacctggatggagtgg 

gagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggagagccagaaccagcag 

gagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtggaactggttcgacatc 

agcaagtggctgtggtacatctaactcgag 



FIG. 33 (SEQ ID NO:46)



gp140.mut8.modSF162.delV1V2



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgggcgccggcaactgccagacc 

agcgtgatcacccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgccccc 

gccggcttcgccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtg 

agcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagc 

ctggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtg 

cagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcacc 

atcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgc 

aacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttc 

ggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttc 

aactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacacc 

atcggccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgc 

tggcaggaggtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaac 

atcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgc 

cccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatc 

gagcccctgggcgtggcccccaccatcgccatcagcagcgtggtgcagagcgagaagagcgccgtg 

accctgggcgccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctg 

accctgaccgtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgc 

gccatcgaggcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgc 

gtgctggccgtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaag 

ctgatctgcaccaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctgg 

aacaacatgacctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctg 

atcgaggagagccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggcc 

agcctgtggaactggttcgacatcagcaagtggctgtggtacatctaactcgag 



FIG. 34 (SEQ ID NO: 47) 



gp160.modSF162



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 

aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 

agcttcaaggtgaccaccagcatccgcaacaagatgcagaaggagtacgccctgttctacaagctg 

gacgtggtgcccatcgacaacgacaacaccagctacaagctgatcaactgcaacaccagcgtgatc 

acccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttc 

gccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtg 

cagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgag 

gagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaag 

gagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcaccatcggcccc 

ggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagc 

ggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaag 

accatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggc 

ggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcggcccc 

aacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggag 

gtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggc 

ctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggc 

ggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctg 

ggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcgcgccgtgaccctgggc 

gccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgacc 

gtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgag 

gcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggcc 

gtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgc 

accaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaacatg 

acctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggag 

agccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtgg 

aactggttcgacatcagcaagtggctgtggtacatcaagatcttcatcatgatcgtgggcggcctg 

gtgggcctgcgcatcgtgttcaccgtgctgagcatcgtgaaccgcgtgcgccagggctacagcccc 

ctgagcttccagacccgcttccccgccccccgcggccccgaccgccccgagggcatcgaggaggag 

ggcggcgagcgcgaccgcgaccgcagcagccccctggtgcacggcctgctggccctgatctgggac 

gacctgcgcagcctgtgcctgttcagctaccaccgcctgcgcgacctgatcctgatcgccgcccgc 

atcgtggagctgctgggccgccgcggctgggaggccctgaagtactggggcaacctgctgcagtac 

tggatccaggagctgaagaacagcgccgtgagcctgttcgacgccatcgccatcgccgtggccgag 

ggcaccgaccgcatcatcgaggtggcccagcgcatcggccgcgccttcctgcacatcccccgccgc 

atccgccagggcttcgagcgcgccctgctgtaactcgag 



FIG. 35 (SEQ ID NO:48)



gp160.modSF162.delV2



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 

aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 

agcttcaaggtgggcgccggcaagctgatcaactgcaacaccagcgtgatcacccaggcctgcccc 

aaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgc 

aacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtgcagtgcacccacggc 

atccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatc 

cgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagatc 

aactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgccttctac 

gccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgagaagtggaac 

aacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagaccatcgtgttcaag 

cagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggcggcgagttcttctac 

tgcaacagcacccagctgttcaacagcacctggaacaacaccatcggccccaacaacaccaacggc 

accatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggaggtgggcaaggccatg 

tacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggcctgctgctgacccgc 

gacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgac 

aactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccacc 

aaggccaagcgccgcgtggtgcagcgcgagaagcgcgccgtgaccctgggcgccatgttcctgggc 

ttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgaccgtgcaggcccgccag 

ctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgaggcccagcagcacctg 

ctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggccgtggagcgctacctg 

aaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgcaccaccgccgtgccc 

tggaacgccagctggagcaacaagagcctggaccagatctggaacaacatgacctggatggagtgg 

gagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggagagccagaaccagcag 

gagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtggaactggttcgacatc 

agcaagtggctgtggtacatcaagatcttcatcatgatcgtgggcggcctggtgggcctgcgcatc 

gtgttcaccgtgctgagcatcgtgaaccgcgtgcgccagggctacagccccctgagcttccagacc 

cgcttccccgccccccgcggccccgaccgccccgagggcatcgaggaggagggcggcgagcgcgac 

cgcgaccgcagcagccccctggtgcacggcctgctggccctgatctgggacgacctgcgcagcctg 

tgcctgttcagctaccaccgcctgcgcgacctgatcctgatcgccgcccgcatcgtggagctgctg 

ggccgccgcggctgggaggccctgaagtactggggcaacctgctgcagtactggatccaggagctg 

aagaacagcgccgtgagcctgttcgacgccatcgccatcgccgtggccgagggcaccgaccgcatc 

atcgaggtggcccagcgcatcggccgcgccttcctgcacatcccccgccgcatccgccagggcttc 

gagcgcgccctgctgtaactcgag 



FIG. 36 (SEQ ID NO:49)



gp160.modSF162.delV1V2



gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgggcgccggcaactgccagacc 
agcgtgatcacccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgccccc
gccggcttcgccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtg 
agcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagc 
ctggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtg 
cagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcacc 
atcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgc 
aacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttc 
ggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttc 
aactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacacc 
atcggccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgc 
tggcaggaggtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaac 
atcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgc 
cccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatc 
gagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcgcgccgtg 
accctgggcgccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctg 
accctgaccgtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgc 
gccatcgaggcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgc 
gtgctggccgtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaag 
ctgatctgcaccaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctgg 
aacaacatgacctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctg 
atcgaggagagccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggcc 
agcctgtggaactggttcgacatcagcaagtggctgtggtacatcaagatcttcatcatgatcgtg 
ggcggcctggtgggcctgcgcatcgtgttcaccgtgctgagcatcgtgaaccgcgtgcgccagggc 
tacagccccctgagcttccagacccgcttccccgccccccgcggccccgaccgccccgagggcatc 
gaggaggagggcggcgagcgcgaccgcgaccgcagcagccccctggtgcacggcctgctggccctg 
atctgggacgacctgcgcagcctgtgcctgttcagctaccaccgcctgcgcgacctgatcctgatc 
gccgcccgcatcgtggagctgctgggccgccgcggctgggaggccctgaagtactggggcaacctg 
ctgcagtactggatccaggagctgaagaacagcgccgtgagcctgttcgacgccatcgccatcgcc 
gtggccgagggcaccgaccgcatcatcgaggtggcccagcgcatcggccgcgccttcctgcacatc 
ccccgccgcatccgccagggcttcgagcgcgccctgctgtaactcgag 



FIG. 37 (SEQ ID NO: 50) 



gp120wtUS4

ACAACAGTCTTGTGGGTCACAGTCTATTATGGGGTACCTGTGTGGAAAGAAG 

CAACCACCACTCTGTTTTGTGCATCAGATGCTAAAGCATACAAAGCAGAGGC 

ACATAACGTCTGGGCTACACATGCCTGTGTACCCACAGACCCCAACCCACAG 

GAAGTAAATTTAACAAATGTGACAGAAAATTTTAACATGTGGAAAAATAACA 

TGGTGGAACAGATGCATGAGGATATAATCAGTTTATGGGATCAAAGCCTAAA 

GCCATGTGTAAAATTAACCCCACTCTGTGTTACTTTAAATTGTACTGATAAGT 

TGACAGGTAGTACTAATGGCACAAATAGTACTAGTGGCACTAATAGTACTAG 

TGGCACTAATAGTACTAGTACTAATAGTACTGATAGTTGGGAAAAGATGCCA 

GAAGGAGAAATAAAAAACTGCTCTTTCAATATCACCACAAGTGTAAGAGATA 

AAGTGCAGAAAGAATATTCTCTCTTCTATAAACTTGATGTAGTACCAATAGAT 

AATGATAATGCTAGCTATAGATTGATAAATTGTAATACCTCAGTCATTACACA 

AGCCTGTCCAAAGGTATCTTTTGAACCAATTCCCATACATTATTGTGCCCCGG 

CTGGTTTTGCGATTCTAAAGTGTAAAGATAAGAAGTTCAATGGAACAGGACC 

ATGTAAAAATGTCAGCACAGTACAATGCACACATGGAATTAGACCAGTAGTA 

TCAACTCAACTGCTGTTAAATGGCAGTCTAGCAGAAGAAGAGATAGTACTTA 

GATCTGAAAATTTCACAGACAATGCTAAAACCATAATAGTACAGCTGAATGA 

ATCTGTAGAAATTAATTGTATAAGACCCAACAATAATACAAGAAAAAGTATA 

CATATAGGACCAGGGAGAGCATTTTATGCAACAGGTGATATAATAGGAGACA 

TAAGACAAGCACATTGTAACATTAGTAAAGCAAACTGGACTAACACTTTAGA 

ACAGATAGTTGAAAAATTAAGAGAACAATTTGGGAATAATAAAACAATAATC 

TTTAATTCATCCTCAGGAGGGGACCCAGAAATTGTATTTCACAGTTTTAATTG 

TGGAGGGGAATTTTTCTATTGTAATACATCACAACTATTTAATAGTACCTGGA 

ATATTACTGAAGAGGTAAATAAGACTAAAGAAAATGACACTATCATACTCCC 

ATGCAGAATAAGACAAATTATAAACATGTGGCAAGAAGTAGGAAAAGCAAT 

GTATGCCCCTCCCATCAGAGGACAAATTAAATGTTCATCAAATATTACAGGG 

CTGCTATTAACTAGAGATGGTGGTACTAACAATAATAGGACGAACGACACCG 

AGACCTTCAGACCTGGGGGAGGAAACATGAAGGACAATTGGAGAAGTGAAT 

TATATAAATATAAAGTAGTAAGAATTGAACCATTAGGAGTAGCACCCACCCA 

GGCAAAGAGAAGAGTGGTGCAAAGAGAGAAAAGA 



FIG. 38 (SEQ ID NO: 51) 



gp140wtUS4



ACAACAGTCTTGTGGGTCACAGTCTATTATGGGGTACCTGTGTGGAAAGAAG 

CAACCACCACTCTGTTTTGTGCATCAGATGCTAAAGCATACAAAGCAGAGGC 

ACATAACGTCTGGGCTACACATGCCTGTGTACCCACAGACCCCAACCCACAG 

GAAGTAAATTTAACAAATGTGACAGAAAATTTTAACATGTGGAAAAATAACA 

TGGTGGAACAGATGCATGAGGATATAATCAGTTTATGGGATCAAAGCCTAAA 

GCCATGTGTAAAATTAACCCCACTCTGTGTTACTTTAAATTGTACTGATAAGT 

TGACAGGTAGTACTAATGGCACAAATAGTACTAGTGGCACTAATAGTACTAG 

TGGCACTAATAGTACTAGTACTAATAGTACTGATAGTTGGGAAAAGATGCCA 

GAAGGAGAAATAAAAAACTGCTCTTTCAATATCACCACAAGTGTAAGAGATA 

AAGTGCAGAAAGAATATTCTCTCTTCTATAAACTTGATGTAGTACCAATAGAT 

AATGATAATGCTAGCTATAGATTGATAAATTGTAATACCTCAGTCATTACACA 

AGCCTGTCCAAAGGTATCTTTTGAACCAATTCCCATACATTATTGTGCCCCGG 

CTGGTTTTGCGATTCTAAAGTGTAAAGATAAGAAGTTCAATGGAACAGGACC 

ATGTAAAAATGTCAGCACAGTACAATGCACACATGGAATTAGACCAGTAGTA 

TCAACTCAACTGCTGTTAAATGGCAGTCTAGCAGAAGAAGAGATAGTACTTA 

GATCTGAAAATTTCACAGACAATGCTAAAACCATAATAGTACAGCTGAATGA 

ATCTGTAGAAATTAATTGTATAAGACCCAACAATAATACAAGAAAAAGTATA 

CATATAGGACCAGGGAGAGCATTTTATGCAACAGGTGATATAATAGGAGACA 

TAAGACAAGCACATTGTAACATTAGTAAAGCAAACTGGACTAACACTTTAGA 

ACAGATAGTTGAAAAATTAAGAGAACAATTTGGGAATAATAAAACAATAATC 

TTTAATTCATCCTCAGGAGGGGACCCAGAAATTGTATTTCACAGTTTTAATTG 

TGGAGGGGAATTTTTCTATTGTAATACATCACAACTATTTAATAGTACCTGGA 

ATATTACTGAAGAGGTAAATAAGACTAAAGAAAATGACACTATCATACTCCC 

ATGCAGAATAAGACAAATTATAAACATGTGGCAAGAAGTAGGAAAAGCAAT 

GTATGCCCCTCCCATCAGAGGACAAATTAAATGTTCATCAAATATTACAGGG 

CTGCTATTAACTAGAGATGGTGGTACTAACAATAATAGGACGAACGACACCG 

AGACCTTCAGACCTGGGGGAGGAAACATGAAGGACAATTGGAGAAGTGAAT 

TATATAAATATAAAGTAGTAAGAATTGAACCATTAGGAGTAGCACCCACCCA 

GGCAAAGAGAAGAGTGGTGCAAAGAGAGAAAAGAGCAGTGGGACTAGGAG 

CTTTGTTCATTGGGTTCTTGGGAGCAGCAGGAAGCACTATGGGCGCAGCGTC 

AGTGACGCTGACGGTACAGGCCAGACAATTATTGTCTGGTATAGTGCAACAG 

CAGAACAATTTGCTGAGAGCTATTGAGGCGCAACAGCATCTGTTGCAACTCA 

CGGTCTGGGGCATCAAACAGCTCCAGGCAAGAATCCTGGCTGTGGAAAGATA 

CCTAAAGGATCAACAGCTCCTAGGGATTTGGGGTTGCTCTGGAAAACTCATTT 

GCACCACTACTGTGCCTTGGAACTCTAGTTGGAGTAATAAATCTCTGACTGAG 

ATTTGGGATAATATGACCTGGATGGAGTGGGAAAGAGAAATTGGCAATTATA 

CAGGCTTAATATACAATTTAATTGAAATAGCACAAAACCAGCAAGAAAAGAA 

TGAACAAGAATTATTGGAATTAGACAAGTGGGCAAGTTTGTGGAATTGGTTT 

GATATAACAAACTGGCTGTGGTATATA 



FIG. 39 (SEQ ID NO:52)






gp160wtUS4

ACAACAGTCTTGTGGGTCACAGTCTATTATGGGGTACCTGTGTGGAAAGAAG 

CAACCACCACTCTGTTTTGTGCATCAGATGCTAAAGCATACAAAGCAGAGGC 

ACATAACGTCTGGGCTACACATGCCTGTGTACCCACAGACCCCAACCCACAG 

GAAGTAAATTTAACAAATGTGACAGAAAATTTTAACATGTGGAAAAATAACA 

TGGTGGAACAGATGCATGAGGATATAATCAGTTTATGGGATCAAAGCCTAAA 

GCCATGTGTAAAATTAACCCCACTCTGTGTTACTTTAAATTGTACTGATAAGT 

TGACAGGTAGTACTAATGGCACAAATAGTACTAGTGGCACTAATAGTACTAG 

TGGCACTAATAGTACTAGTACTAATAGTACTGATAGTTGGGAAAAGATGCCA 

GAAGGAGAAATAAAAAACTGCTCTTTCAATATCACCACAAGTGTAAGAGATA 

AAGTGCAGAAAGAATATTCTCTCTTCTATAAACTTGATGTAGTACCAATAGAT 

AATGATAATGCTAGCTATAGATTGATAAATTGTAATACCTCAGTCATTACACA 

AGCCTGTCCAAAGGTATCTTTTGAACCAATTCCCATACATTATTGTGCCCCGG 

CTGGTTTTGCGATTCTAAAGTGTAAAGATAAGAAGTTCAATGGAACAGGACC 

ATGTAAAAATGTCAGCACAGTACAATGCACACATGGAATTAGACCAGTAGTA 

TCAACTCAACTGCTGTTAAATGGCAGTCTAGCAGAAGAAGAGATAGTACTTA 

GATCTGAAAATTTCACAGACAATGCTAAAACCATAATAGTACAGCTGAATGA 

ATCTGTAGAAATTAATTGTATAAGACCCAACAATAATACAAGAAAAAGTATA 

CATATAGGACCAGGGAGAGCATTTTATGCAACAGGTGATATAATAGGAGACA 

TAAGACAAGCACATTGTAACATTAGTAAAGCAAACTGGACTAACACTTTAGA 

ACAGATAGTTGAAAAATTAAGAGAACAATTTGGGAATAATAAAACAATAATC 

TTTAATTCATCCTCAGGAGGGGACCCAGAAATTGTATTTCACAGTTTTAATTG 

TGGAGGGGAATTTTTCTATTGTAATACATCACAACTATTTAATAGTACCTGGA 

ATATTACTGAAGAGGTAAATAAGACTAAAGAAAATGACACTATCATACTCCC 

ATGCAGAATAAGACAAATTATAAACATGTGGCAAGAAGTAGGAAAAGCAAT 

GTATGCCCCTCCCATCAGAGGACAAATTAAATGTTCATCAAATATTACAGGG 

CTGCTATTAACTAGAGATGGTGGTACTAACAATAATAGGACGAACGACACCG 

AGACCTTCAGACCTGGGGGAGGAAACATGAAGGACAATTGGAGAAGTGAAT 

TATATAAATATAAAGTAGTAAGAATTGAACCATTAGGAGTAGCACCCACCCA 

GGCAAAGAGAAGAGTGGTGCAAAGAGAGAAAAGAGCAGTGGGACTAGGAG 

CTTTGTTCATTGGGTTCTTGGGAGCAGCAGGAAGCACTATGGGCGCAGCGTC 

AGTGACGCTGACGGTACAGGCCAGACAATTATTGTCTGGTATAGTGCAACAG 

CAGAACAATTTGCTGAGAGCTATTGAGGCGCAACAGCATCTGTTGCAACTCA 

CGGTCTGGGGCATCAAACAGCTCCAGGCAAGAATCCTGGCTGTGGAAAGATA 

CCTAAAGGATCAACAGCTCCTAGGGATTTGGGGTTGCTCTGGAAAACTCATTT 

GCACCACTACTGTGCCTTGGAACTCTAGTTGGAGTAATAAATCTCTGACTGAG 

ATTTGGGATAATATGACCTGGATGGAGTGGGAAAGAGAAATTGGCAATTATA 

CAGGCTTAATATACAATTTAATTGAAATAGCACAAAACCAGCAAGAAAAGAA 

TGAACAAGAATTATTGGAATTAGACAAGTGGGCAAGTTTGTGGAATTGGTTT 

GATATAACAAACTGGCTGTGGTATATAAGAATATTCATAATGATAGTAGGAG 

GCTTGATAGGTTTAAGAATAGTTTTTGCTGTACTTTCTATAGTGAATAGAGTT 

AGGCAGGGATACTCACCAATATCATTGCAGACCCGCCTCCCAGCTCAGAGGG 



FIG. 40 (SEQ ID NO:53)



GACCCGACAGGCCCGAAGGAATCGAAGAAGAAGGTGGAGAGAGAGACAGA 

GACAGATCCAATCGATTAGTGCATGGATTATTGGCACTCATCTGGGACGATCT 

GCGGAGCCTGTGCCTCTTCAGCTACCACCGCTTGAGAGACTTACTCTTGATTG 

TAGCGAGGATTGTGGAACTTCTGGGACGCAGGGGGTGGGAAGCCCTCAAGTA 

TTGGTGGAATCTCCTGCAGTATTGGAGTCAGGAGCTAAAGAGTAGTGCTGTT 

AGTTTGTTTAATGCCACAGCAATAGCAGTAGCTGAAGGGACAGATAGGATTA 

TAGAAATAGTACAAAGAATTTTTAGAGCTGTAATTCACATACCTAGAAGAAT 

AAGACAGGGCTTGGAGAGGGCTTTACTATAA 



FIG. 40 CONT'D (SEQ ID NO:53)



gp120.modUS4



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGGCACCAACAGCACCAGCGGCAC 

CAACAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACCGACAGCTGGGAGAAGATG 

CCCGAGGGCGAGATCAAGAACTGCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 

GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACGCCAGCT 

ACCGCCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGC 

CCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGT 

TCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTC 

CGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCA 

ACTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCT 

ACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAAC 

TGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGAC 

CATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGG 

CGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGA 

GGTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCA 

ACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGC 

AGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAA 

CGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGT 

ACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGC 

GTGGTGCAGCGCGAGAAGCGCTAAGATATCGGATCCTCTAGA 



FIG. 41 (SEQ ID NO:54)



gp120.mod.US4.del128-194

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGG 
AGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCG 
TGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTAC 
AAGGCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCC 
CCAGGAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGG 
TGGAGCAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTG 
AAGCTGACCCCCCTGTGCGTGGGGGCAGGGAACTGCGAGACCAGCGTGATCACCCAGGC 
CTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCG 
CCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGC 
ACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGG 
CAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGA 
CCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAAC 
ACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCAT 
CGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTCG 
AGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAAC 
AGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTT 
CTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGA 
ACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAAC 
ATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTG 
CAGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCA 
CCAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGC 
GAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGC 
CAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCTAAGATATCGGATCCTCTAGA 



FIG. 42 (SEQ ID NO:55)



gp140.modUS4



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGGCACCAACAGCACCAGCGGCAC 

CAACAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACCGACAGCTGGGAGAAGATG 

CCCGAGGGCGAGATCAAGAACTGCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 

GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACGCCAGCT 

ACCGCCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGC 

CCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGT 

TCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTC 

CGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCA 

ACTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCT 

ACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAAC 

TGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGAC 

CATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGG 

CGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGA 

GGTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCA 

ACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGC 

AGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAA 

CGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGT 

ACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGC 

GTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCC 

GCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAG 

CGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGC 

AGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGCTACCTG 

AAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCACCGT 

GCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAACATGACCTGGA 

TGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCC 

CAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGT 

GGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCTAAGATATCGGATCCTCTAGA 



FIG. 43 (SEQ ID NO:56)



gp140.mut.modUS4



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGGCACCAACAGCACCAGCGGCAC 

CAACAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACCGACAGCTGGGAGAAGATG 

CCCGAGGGCGAGATCAAGAACTGCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 

GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACGCCAGCT 

ACCGCCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGC 

CCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGT 

TCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTC 

CGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCA 

ACTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCT 

ACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAAC 

TGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGAC 

CATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGG 

CGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGA 

GGTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCA 

ACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGC 

AGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAA 

CGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGT 

ACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGC 

GTGGTGCAGCGCGAGAAGAGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCC

GCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAG

CGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGC 

AGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGCTACCTG 

AAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCACCGT 

GCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAACATGACCTGGA 

TGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCC 

CAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGT

GGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCTAAGATATCGGATCCTCTAGA 



FIG. 44 (SEQ ID NO:57)



gp140.TM.modUS4



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGGCACCAACAGCACCAGCGGCAC 

CAACAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACCGACAGCTGGGAGAAGATG 

CCCGAGGGCGAGATCAAGAACTGCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 

GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACGCCAGCT 

ACCGCCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGC 

CCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGT 

TCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTC 

CGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCA 

ACTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCT 

ACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAAC 

TGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGAC 

CATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGG 

CGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGA 

GGTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCA 

ACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGC 

AGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAA 

CGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGT 

ACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGC 

GTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCC 

GCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAG 

CGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGC 

AGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGCTACCTG 

AAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCACCGT 

GCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAACATGACCTGGA 

TGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCC 

CAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGT 

GGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCCGCATCTTCATCATGATCGTGGGCG 

GCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCATCGTGTAAGATATCGGATCCTCTA 

GA 



FIG. 45 (SEQ ID NO:58)



Gp140modUS4.DV1V2



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGC 

TGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACC 

GTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCG 

CCAGCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTGGGCCACCCA 

CGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACCTGACCAACGTG 

ACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCCGGCC 

AGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 

CGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGC 

CCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGG 

TGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCT 

GCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAC 

GAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGTAAGAGCA 

TCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGA 

CATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTC 

GAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATC 

ATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCA 

ACTGCGGCGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCAC 

CTGGAACATCACCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT 

CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAGGAGGTGGGCAAG 

GCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCAATATTA 

CCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAACGA 

CACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGC 

GAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCA 

CCCAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGG 

GCGCCCTGTTCATCGGCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCCGC 

CTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAG 

CAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGC 

TGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCG 

CTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTG 

ATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGA 

CCGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCA 

ACTACACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACCAGCAGGA 

GAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAA 

CTGGTTCGACATCACCAACTGGCTGTGGTACATCTAAGATATCGGATCCTCTA 

GA 



FIG. 46 (SEQ ID NO:59)



Gp140modUS4.DV2



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGC 

TGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACC 

GTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCG 

CCAGCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTGGGCCACCCA 

CGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACCTGACCAACGTG 

ACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCC 

CCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGG 

CACCAACAGCACCAGCGGCACCAACAGCACCAGCGGCACCAACAGCACCAG 

CACCAACAGCACCGACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAA 

CTGCAGCTTCAACATCGGCGCCGGCCGCCTGATCAACTGCAACACCAGCGTG 

ATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACT 

GCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGG 

CACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGC 

CCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGA 

TCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCA 

GCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGT 

AAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCA 

TCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAA 

CACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAA 

GACCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCAC 

AGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAA 

CAGCACCTGGAACATCACCGAGGAGGTGAACAAGACCAAGGAGAACGACAC 

CATCATCCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAGGAGGTG 

GGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCA 

ATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCAC 

CAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTG 

GCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTG 

GCCCCCACCCAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTG 

GGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCCGCCGGGAGCACCATGG 

GCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCAT 

CGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTG 

CTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCG 

TGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGG 

CAAGCTGATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAG 

AGCCTGACCGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAG 

ATCGGCAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACC 

AGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCC 

TGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCTAAGATATCGG 

ATCCTCTAGA 



FIG. 47 (SEQ ID NO:60)



Gp140modmutUS4.DV1V2



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGC 

TGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACC 

GTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCG 

CCAGCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTGGGCCACCC 

ACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACCTGACCAACGT 

GACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGA 

GGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCCGGC 

CAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCC 

CCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGG 

CCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCGTG 

GTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGC 

TGCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAA 

CGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGTAAGAGC 

ATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCG 

ACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCT 

CGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCAT 

CATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTC 

AACTGCGGCGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCA 

CCTGGAACATCACCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCA 

TCCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAGGAGGTGGGCAA 

GGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCAATATT 

ACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAACG 

ACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCA 

GCGAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCC 

CACCCAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGAGCGCCGTGGGCCT 

GGGCGCCCTGTTCATCGGCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCC 

GCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGC 

AGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCA 

GCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAG 

CGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGC 

TGATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCT 

GACCGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGG 

CAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACCAGCAG 

GAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGG 

AACTGGTTCGACATCACCAACTGGCTGTGGTACATCTAAGATATCGGATCCTC 

TAGA 



FIG. 48 (SEQ ID NO:61)



gp140.mod.US4.del128-194



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGG 
AGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCG 
TGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTAC 
AAGGCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCC 
CCAGGAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGG 
TGGAGCAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTG 
AAGCTGACCCCCCTGTGCGTGGGGGCAGGGAACTGCGAGACCAGCGTGATCACCCAGGC 
CTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCG
CCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGC 
ACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGG 
CAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGA 
CCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAAC 
ACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCAT 
CGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTCG 
AGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAAC 
AGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTT 
CTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGA 
ACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAAC 
ATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTG 
CAGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCA 
CCAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGC 
GAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGC 
CAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCG 
GCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAG 
GCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGA 
GGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCA 
TCCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGC 
GGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCT 
GACCGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACA 
CCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAG 
GAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTG 
GCTGTGGTACATCTAAGATATCGGATCCTCTAGA 



FIG. 49 (SEQ ID NO:62)



gp140.mut.mod.US4.del128-194



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGG 
AGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCG 
TGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTAC 
AAGGCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCC 
CCAGGAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGG 
TGGAGCAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTG 
AAGCTGACCCCCCTGTGCGTGGGGGCAGGGAACTGCGAGACCAGCGTGATCACCCAGGC 
CTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCG 
CCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGC 
ACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGG 
CAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGA 
CCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAAC 
ACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCAT 
CGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTCG
AGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAAC 
AGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTT 
CTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGA 
ACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAAC 
ATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTG 
CAGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCA 
CCAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGC 
GAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGC 
CAAGCGCCGCGTGGTGCAGCGCGAGAAGAGCGCCGTGGGCCTGGGCGCCCTGTTCATCG 
GCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAG 
GCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGA 
GGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCA 
TCCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGC 
GGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCT 
GACCGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACA 
CCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAG 
GAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTG 
GCTGTGGTACATCTAAGATATCGGATCCTCTAGA 



FIG. 50 (SEQ ID NO:63) 



gp160.modUS4



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGGCACCAACAGCACCAGCGGCAC 

CAACAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACCGACAGCTGGGAGAAGATG 

CCCGAGGGCGAGATCAAGAACTGCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 

GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACGCCAGCT 

ACCGCCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGC 

CCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGT 

TCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTC 

CGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCA 

ACTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCT 

ACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAAC 

TGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGAC 

CATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGG 

CGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGA 

GGTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCA 

ACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGC 

AGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAA 

CGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGT 

ACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGC 

GTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCC 

GCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAG 

CGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGC 

AGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGCTACCTG 

AAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCACCGT 

GCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAACATGACCTGGA 

TGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCC 

CAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGT 

GGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCCGCATCTTCATCATGATCGTGGGCG 

GCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCT 

ACAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCCCAGCGCGGCCCCGACCGCCCCGAGGGC 

ATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAACCGCCTGGTGCACGGCCTGCT 

GGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCT 

GCTGCTGATCGTGGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGT 

ACTGGTGGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAGCGCCGTGAGCCTGTTC 

AACGCCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCAT 

CTTCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTA 

AGATATCGGATCCTCTAGA 



FIG. 51 (SEQ ID NO:64)



gp160.modUS4.delV1



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGAACTGCACCGACAAGCTGGGCGCCGGCGGCGAGATCAAGAACTGCAGCTTCAACAT 

CACCACCAGCGTGCGCGACAAGGTGCAGAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGG

TGCCCATCGACAACGACAACGCCAGCTACCGCCTGATCAACTGCAACACCAGCGTGATCACCC 

AGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCG 

CCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGCACC 

GTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTG 

GCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCGT 

GCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGTAAGAGCA 

TCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGG 

CCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTG 

CGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGA 

GATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTT 

CAACAGCACCTGGAACATCACCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCATCC 

TGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCC 

CCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGACCCGCGAC 

GGCGGCACCAACAACAACCGCACCAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACAT 

GAAGGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCG 

TGGCCCCCACCCAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGC 

GCCCTGTTCATCGGCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTG 

ACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGC 

CATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCC 

GCATCCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGC 

GGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGAC 

CGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCC 

TGATCTACAACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTG 

GAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATC 

CGCATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGC

ATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCCCAG 

CGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCA 

GCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGT 

TCAGCTACCACCGCCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAGCTGCTGGGCC 

GCCGCGGCTGGGAGGCCCTGAAGTACTGGTGGAACCTGCTGCAGTACTGGAGCCAGGAGCTG 

AAGAGCAGCGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCCGAGGGCACCGACCG 

CATCATCGAGATCGTGCAGCGCATCTTCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCA 

GGGCCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA 



FIG. 52 (SEQ ID NO:65)



gp160.mod.US4.delV2

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGG
AGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCG 
TGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTAC 
AAGGCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCC 
CCAGGAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGG 
TGGAGCAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTG 
AAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAA 
CGGCACCAACAGCACCAGCGGCACCAACAGCACCAGCGGCACCAACAGCACCAGCACCA 
ACAGCACCGACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAACTGCAGCTTCAAC 
ATCGGCGCCGGCCGCCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAA 
GGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGA 
AGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAG 
TGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGC 
CGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCG 
TGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGTAAG 
AGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACAT 
CCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAGATCG 
TGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAACAGCAGCAGC 
GGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTG 
CAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGAACAAGACCA 
AGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG 
GAGGTGGGCAAGGCCATGTACGCCCCCCCGATCCGCGGCCAGATCAAGTGCAGCAGCAA 
TATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAACGACA 
CCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGTAC 
AAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCG 
CGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGG 
GCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAG 
CTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 
GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCG 
TGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTG 
ATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGAT 
CTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGA 
TCTACAACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTG 
GAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTA 
CATCCGCATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCG 
TGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCCTGCAGACCCGC 
CTGCCCGCCCAGCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCG 
CGACCGCGACCGCAGCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACC 
TGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGCTGCTGATCGTGGCC 
CGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGTGGAACCT 
GCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAGCGCCGTGAGCCTGTTCAACGCCACCG 
CCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCTTCCGC 
GCCGTGATCCACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAAGA 
TATCGGATCCTCTAGA 

FIG. 53 (SEQ ID NO:66) 



gp160.modUS4delV1/2



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCCGGCCAGGCCTGCCC 

CAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAA 

GTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCA 

CCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAG 

GAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAA 

CGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCG 

GCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCA 

ACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAG 

TTCGGCAACAACAAGACCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTT 

CCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCAC

CTGGAACATCACCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCC 

GCATCCGCCAGATCATCAACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATC 

CGCGGCCAGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCAC 

CAACAACAACCGCACCAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACA 

ACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCC 

ACCCAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTT 

CATCGGCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCA 

GGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGG 

CCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTG 

GCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCT 

GATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCT 

GGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTAC 

AACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGG 

ACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCCGCATCT 

TCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCATCGTGA 

ACCGCGTGCGCCAGGGCTACAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCCCAGCGCGGC 

CCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAACC 

GCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCT 

ACCACCGCCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAGCTGCTGGGCCGCCGCG 

GCTGGGAGGCCCTGAAGTACTGGTGGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGC 

AGCGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATC 

GAGATCGTGCAGCGCATCTTCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGGCCTG 

GAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA 



FIG. 54 (SEQ ID NO:67) 



gp160.modUS4 del128-194



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

GGGGCAGGGAACTGCGAGACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCC 

CATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGTT

CAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCG 

TGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCC 

GAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAA 

CTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTA 

CGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACT 

GGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACC 

ATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGC 

GGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAG 

GTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAA 

CATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCA 

GCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAAC 

GACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGTA 

CAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCG 

TGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCCG 

CCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGC 

GGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCA 

GCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGCTACCTGA 

AGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCACCGTG 

CCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAACATGACCTGGAT 

GGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCCC 

AGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTG 

GAACTGGTTCGACATCACCAACTGGCTGTGGTACATCCGCATCTTCATCATGATCGTGGGCGG 

CCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTA 

CAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCCCAGCGCGGCCCCGACCGCCCCGAGGGCA 

TCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAACCGCCTGGTGCACGGCCTGCTG 

GCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTG 

CTGCTGATCGTGGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTAC 

TGGTGGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAGCGCCGTGAGCCTGTTCAA 

CGCCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCTT 

CCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAAGA 
TATCGGATCCTCTAGA 



FIG. 55 (SEQ ID NO:68)



Env_US4_C4wt 

GACACTATCATACTCCCATGCAGAATAAGACAAATTATAAACATGTGGCAAGAAGTAGG 
AAAAGCAATGTATGCCCCTCCCATCAGAGGACAAATTAAATGTTCATCAAATATTACAG 
GGCTGCTATTAACTAGAGATGGTGGT 



FIG. 56 (SEQ ID NO:69)



Env_SF162_C4wt 

GGAACTATCACACTCCCATGCAGAATAAAACAAATTATAAACAGGTGGCAGGAAGTAGG 
AAAAGCAATGTATGCCCCTCCCATCAGAGGACAAATTAGATGCTCATCAAATATTACAG 
GACTGCTATTAACAAGAGATGGTGGT 



FIG. 57 (SEQ ID NO:70)



Env_US4_C4mod 

GACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAGGAGGTGGG 
CAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCAACATCACCG 
GCCTGCTGCTGACCCGCGACGGCGGC 



FIG. 58 (SEQ ID NO:71)



Env_SF162_C4mod 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGG 
CAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCG 
GCCTGCTGCTGACCCGCGACGGCGGC 



FIG. 59 (SEQ ID NO:72)
FIG. 60



gp160mod.us4.gag.modSF2

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGA 
GCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG 
CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAG 
GCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG 
GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
CAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG 
ACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGGCACC 
AACAGCACCAGCGGCACCAACAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACC 
GACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAACTGCAGCTTCAACATCACCACC 
AGCGTGCGCGACAAGGTGCAGAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCC 
ATCGACAACGACAACGCCAGCTACCGCCTGATCAACTGCAACACCAGCGTGATCACCCAG 
GCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC 
GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGC 
ACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 
AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGACC 
ATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG 
CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGC 
GACATCCGCCAGGCCCACTGCAACATCAGGAAGGCCAACTGGACCAACACCCTCGAGCAG 
ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAACAGCAGC 
AGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC 
TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGAACAAGACC 
AAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG 
GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCAAT 
ATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC 
GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGTACAAG 
TACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG 
GTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCC 
GCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTG 
AGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTG 
CTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGC 
TACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACC 
ACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAAC 
ATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTACAACCTG 
ATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAG 
TGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCCGCATCTTC 
ATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCATCGTG 
AACCGCGTGCGCCAGGGCTACAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCCCAGCGC 
GGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGC 
AACCGCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTG 
TTCAGCTACCACCGCCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAGCTGCTG 
GGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGTGGAACCTGCTGCAGTACTGGAGCCAG 
GAGCTGAAGAGCAGCGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCCGAGGGC 
ACCGACCGCATCATCGAGATCGTGCAGCGCATCTTCCGCGCCGTGATCCACATCCCCCGC 
CGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGAGAATTC 

FIG. 61 (SEQ ID NO: 73) 



CGCCCCCCCCCCCCCCCCCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCGC 
TTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCCGTCTTTT 
GGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAGGGGTCTT 
TCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTTCCTCTG 
GAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCCCCCA 
CCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGCG 
GCACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCC 
TCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGATCT 
GATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTA 
GGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATAATACCATGGGCGC 
CCGCGCCAGCGTGCTGAGCGGCGGCGAGCTGGACAAGTGGGAGAAGATCCGCCTGCGCCC 
CGGCGGCAAGAAGAAGTACAAGCTGAAGCACATCGTGTGGGCCAGCCGCGAGCTGGAGCG 
CTTCGCCGTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCCGCCAGATCCTGGGCCA 
GCTGCAGCCCAGCCTGCAGACCGGCAGCGAGGAGCTGCGCAGCCTGTACAACACCGTGGC 
CACCCTGTACTGCGTGCACCAGCGCATCGACGTCAAGGACACCAAGGAGGCCCTGGAGAA 
GATCGAGGAGGAGCAGAACAAGTCCAAGAAGAAGGCCCAGCAGGCCGCCGCCGCCGCCGG 
CACCGGCAACAGCAGCCAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTGCAGGGCCA 
GATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGTGGAGGA 
GAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCAGCGCCCTGAGCGAGGGCGCCACCCC 
CCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCT 
GAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGG 
CCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCAGCGACATCGCCGGCACCACCAG 
CACCCTGCAGGAGCAGATCGGCTGGATGACCAACAACCCCCCCATCCCCGTGGGCGAGAT 
CTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATGTACAGCCCCACCAG 
CATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTA
CAAGACCCTGCGCGCTGAGCAGGCCAGCCAGGACGTGAAGAACTGGATGACCGAGACCCT 
GCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGAAGGCTCTCGGCCCCGCGGC 
CACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCACAAGGCCCG 
CGTGCTGGCCGAGGCGATGAGCCAGGTGACGAACCCGGCGACCATCATGATGCAGCGCGG 
CAACTTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCAACTGCGGCAAGGAGGGCCACAC 
CGCCAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGCGCTGCGGCCGCGAGGGCCA 
CCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCCAGCTA 
CAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGAGGA 
GAGCTTCCGCTTCGGCGAGGAGAAGACCACCCCCAGCCAGAAGCAGGAGCCCATCGACAA 
GGAGCTGTACCCCCTGACCAGCCTGCGCAGCCTGTTCGGCAACGACCCCAGCAGCCAGTA 
AGAATTCAGACTCGAGCAAGTCTAGA 



FIG. 61 (CONT'D.) (SEQ ID NO: 73) 



gp160mod.SF162.gag.modSF2



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGG 

AGCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCG 

TGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTAC 

GACACCGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCC 

CCAGGAGATCGTGCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGG 

TGGAGCAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTG 

AAGCTGACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGAACGCCACCAACAC 

CAAGAGCAGCAACTGGAAGGAGATGGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGG 

TGACCACCAGCATCCGCAACAAGATGCAGAAGGAGTACGCCCTGTTCTACAAGCTGGAC 

GTGGTGCCCATCGACAACGACAACACCAGCTACAAGCTGATCAACTGCAACACCAGCGT 

GATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCC 

CCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGC 

ACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCT 

GCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCG 

ACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGC 

CCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCAC 

CGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGA 

ACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGACCATC 

GTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCGG 

CGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCA 

TCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATC 

AACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCG 

CTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCA 

ACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAG 

CTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAA 

GCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCT 

TCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCC 

CGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGC 

CCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGC 

TGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGC 

AAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGA 

CCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCA 

ACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAG 

CTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCT 

GTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGT 

TCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAG 

ACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGG 

ACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATC 

GCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGG 

CAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACG 

CCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATC 

GGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCT 



FIG. 62 (SEQ ID NO:74)



GTAACTCGAGCAAGTCTAGAGAATTCCGCCCCCCCCCCCCCCCCCCCTCTCCCTCCCCC 

CCCCCTAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATAT 

GTTATTTTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTG 

TCTTCTTGACGAGCATTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTG 

TTGAATGTCGTGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGT 

AGCGACCCTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAA 

AGCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGT 

TGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAA 

GGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCACATGCT 

TTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTAGGCCCCCCGAACCACGGGGACGTG 

GTTTTCCTTTGAAAAACACGATAATACCATGGGCGCCCGCGCCAGCGTGCTGAGCGGCG 

GCGAGCTGGACAAGTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGAAGTACAAG 

CTGAAGCACATCGTGTGGGCCAGCCGCGAGCTGGAGCGCTTCGCCGTGAACCCCGGCCT 

GCTGGAGACCAGCGAGGGCTGCCGCCAGATCCTGGGCCAGCTGCAGCCCAGCCTGCAGA 

CCGGCAGCGAGGAGCTGCGCAGCCTGTACAACACCGTGGCCACCCTGTACTGCGTGCAC 

CAGCGCATCGACGTCAAGGACACCAAGGAGGCCCTGGAGAAGATCGAGGAGGAGCAGAA 

CAAGTCCAAGAAGAAGGCCCAGCAGGCCGCCGCCGCCGCCGGCACCGGCAACAGCAGCC 

AGGTGAGCCAGAACTACCCCATCGTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCC 

ATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGTGGAGGAGAAGGCCTTCAGCCC 

CGAGGTGATCCCCATGTTCAGCGCCCTGAGCGAGGGCGCCACCCCCCAGGACCTGAACA 

CGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGAGACCATC 

AACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCC 

CGGCCAGATGCGCGAGCCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGG 

AGCAGATCGGCTGGATGACCAACAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGG 

TGGATCATCCTGGGCCTGAACAAGATCGTGCGGATGTACAGCCCCACCAGCATCCTGGA 

CATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTACAAGACCC 

TGCGCGCTGAGCAGGCCAGCCAGGACGTGAAGAACTGGATGACCGAGACCCTGCTGGTG 

CAGAACGCCAACCCCGACTGCAAGACCATCCTGAAGGCTCTCGGCCCCGCGGCCACCCT 

GGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCACAAGGCCCGCGTGC 

TGGCCGAGGCGATGAGCCAGGTGACGAACCCGGCGACCATCATGATGCAGCGCGGCAAC 

TTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCAACTGCGGCAAGGAGGGCCACACCGC 

CAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGCGCTGCGGCCGCGAGGGCCACC 

AGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCCAGCTAC

AAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGAGGA 

GAGCTTCCGCTTCGGCGAGGAGAAGACCACCCCCAGCCAGAAGCAGGAGCCCATCGACA 

AGGAGCTGTACCCCCTGACCAGCCTGCGCAGCCTGTTCGGCAACGACCCCAGCAGCCAG 

TAAGAATTCAGACTCGAGCAAGTCTAGA 



FIG. 62 (CONT'D.) (SEQ ID NO:74)



gp160modUS4.delV1/V2.gag.modSF2



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGA 
GCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG 
CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAG
GCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG 
GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
CAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCC 
GGCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCC 
GGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAAC 
GTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTG 
AACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCC 
AAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAAC 
AACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATC 
ATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTC 
GAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAAC 
AGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTC 
TTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGAAC 
AAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAACATG 
TGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGC 
AGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAAC 
GACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTG 
TACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGC 
CGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTG 
GGCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAG 
CTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAG 
CACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTG 
GAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATC 
TGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGG 
GACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTAC 
AACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTG 
GACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCCGC 
ATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGC 
ATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCC 
CAGCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGAC 
CGCAGCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTG 
TGCCTGTTCAGCTACCACCGCCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAG 
CTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGTGGAACCTGCTGCAGTACTGG 
AGCCAGGAGCTGAAGAGCAGCGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCC 
GAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCTTCCGCGCCGTGATCCACATC 
CCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA 
GAATTCCGCCCCCCCCCCCCCCCCCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGA 
AGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCCG 
TCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAGG 
GGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTT 



FIG. 63 (SEQ ID NO:75)



CCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAAC 
CCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCA 
AAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGG 
CTCTCCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATG 
GGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAA 
CGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATAATACCAT 
GGGCGCCCGCGCCAGCGTGCTGAGCGGCGGCGAGCTGGACAAGTGGGAGAAGATCCGCCT 
GCGCCCCGGCGGCAAGAAGAAGTACAAGCTGAAGCACATCGTGTGGGCCAGCCGCGAGCT 
GGAGCGCTTCGCCGTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCCGCCAGATCCT 
GGGCCAGCTGCAGCCCAGCCTGCAGACCGGCAGCGAGGAGCTGCGCAGCCTGTACAACAC 
CGTGGCCACCCTGTACTGCGTGCACCAGCGCATCGACGTCAAGGACACCAAGGAGGCCCT 
GGAGAAGATCGAGGAGGAGCAGAACAAGTCCAAGAAGAAGGCCCAGCAGGCCGCCGCCGC 
CGCCGGCACCGGCAACAGCAGCCAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTGCA 
GGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGT 
GGAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCAGCGCCCTGAGCGAGGGCGC 
CACCCCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCA 
GATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCA 
CGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCAGCGACATCGCCGGCAC 
CACCAGCACCCTGCAGGAGCAGATCGGCTGGATGACCAACAACCCCCCCATCCCCGTGGG 
CGAGATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATGTACAGCCC 
CACCAGCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCG 
CTTCTACAAGACCCTGCGCGCTGAGCAGGCCAGCCAGGACGTGAAGAACTGGATGACCGA 
GACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGAAGGCTCTCGGCCC 
CGCGGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCACAA 
GGCCCGCGTGCTGGCCGAGGCGATGAGCCAGGTGACGAACCCGGCGACCATCATGATGCA 
GCGCGGCAACTTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCAACTGCGGCAAGGAGGG 
CCACACCGCCAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGCGCTGCGGCCGCGA 
GGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCC 
CAGCTACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCC 
CGAGGAGAGCTTCCGCTTCGGCGAGGAGAAGACCACCCCCAGCCAGAAGCAGGAGCCCAT 
CGACAAGGAGCTGTACCCCCTGACCAGCCTGCGCAGCCTGTTCGGCAACGACCCCAGCAG 
CCAGTAAGAATTCAGACTCGAGCAAGTCTAGA 

gp160.modSF162.delV2.gag.modSF2

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGA 
GCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTG 
CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGAC 
ACCGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG 
GAGATCGTGCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
CAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG 
ACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGC 
AGCAACTGGAAGGAGATGGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGGGCGCC 
GGCAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTC 
GAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGAC 
AAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGC 
ATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTG 
GTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAG 
AGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGC 
CCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGC 
AACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 
CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTG 
ATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAAC 
AGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGC 
CGCATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCC 
ATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGC 
GGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGAC 
AACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCC 
CCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCC 
ATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTG 
ACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGC 
GCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAG 
GCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGC 
TGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAG 
AGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAAC 
TACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAG 
CAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAG 
TGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATC 
GTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 
CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGC 
GGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGG 
GACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATC 
GCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGC 
AACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCC 
ATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGC 
CGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAA 
CTCGAGCAAGTCTAGAGAATTCCGCCCCCCCCCCCCCCCCCCCTCTCCCTCCCCCCCCCC 
TAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTTATT 
TTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTT 
GACGAGCATTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGT

CGTGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCT

TTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGT 

ATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGT 

GGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAA 

GGTACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTA 

GTCGAGGTTAAAAAAACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAA 

AACACGATAATACCATGGGCGCCCGCGCCAGCGTGCTGAGCGGCGGCGAGCTGGACAAGT 

GGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGAAGTACAAGCTGAAGCACATCGTGT 

GGGCCAGCCGCGAGCTGGAGCGCTTCGCCGTGAACCCCGGCCTGCTGGAGACCAGCGAGG 

GCTGCCGCCAGATCCTGGGCCAGCTGCAGCCCAGCCTGCAGACCGGCAGCGAGGAGCTGC 

GCAGCCTGTACAACACCGTGGCCACCCTGTACTGCGTGCACCAGCGCATCGACGTCAAGG 

ACACCAAGGAGGCCCTGGAGAAGATCGAGGAGGAGCAGAACAAGTCCAAGAAGAAGGCCC 

AGCAGGCCGCCGCCGCCGCCGGCACCGGCAACAGCAGCCAGGTGAGCCAGAACTACCCCA 

TCGTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACG 

CCTGGGTGAAGGTGGTGGAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCAGCG 

CCCTGAGCGAGGGCGCCACCCCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCC 

ACCAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACC 

GCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCA 

GCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGGCTGGATGACCAACAACC 

CCCCCATCCCCGTGGGCGAGATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCG 

TGCGGATGTACAGCCCCACCAGCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCC 

GCGACTACGTGGACCGCTTCTACAAGACCCTGCGCGCTGAGCAGGCCAGCCAGGACGTGA 

AGAACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCC 

TGAAGGCTCTCGGCCCCGCGGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGG 

GCGGCCCCGGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGCCAGGTGACGAACCCGG 

CGACCATCATGATGCAGCGCGGCAACTTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCA 

ACTGCGGCAAGGAGGGCCACACCGCCAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCT 

GGCGCTGCGGCCGCGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCC 

TGGGCAAGATCTGGCCCAGCTACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCG 

AGCCCACCGCCCCCCCCGAGGAGAGCTTCCGCTTCGGCGAGGAGAAGACCACCCCCAGCC 

AGAAGCAGGAGCCCATCGACAAGGAGCTGTACCCCCTGACCAGCCTGCGCAGCCTGTTCG 

GCAACGACCCCAGCAGCCAGTAAGAATTCAGACTCGAGCAAGTCTAGA 

1 50

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCT 
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCT 
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCT 
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCT 
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCT 
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCT 
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCT 
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCT 
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCT 
51 100

GCTGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGG 
GCTGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGG 
GCTGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGG 
GCTGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGG 
GCTGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGG 
GCTGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGG 
GCTGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGG
GCTGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGG 
GCTGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGG 
101 150 
TGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTG 
TGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTG 
TGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTG 
TGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTG 
TGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTG 
TGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTG 
TGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTG 
TGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTG 
TGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTG 
151 200 
TTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTG 
TTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTG 
TTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTG 
TTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTG 
TTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTG 
TTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTG 
TTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTG 
TTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTG 
TTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTG 
201 250 
GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGC 
GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGC 
GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGC 
GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGC 
GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGC 
GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGC 
GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGC 
GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGC 
GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGC 
251 300 
TGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
TGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
TGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
TGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
TGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
TGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
TGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
TGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG
TGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
301 350 
CAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTG 
CAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTG 
CAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTG 
CAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTG 
CAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTG 
CAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTG 
CAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTG 
CAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTG 
CAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTG 
351 400 
CGTGAAGCTGACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGA 
CGTGAAGCTGACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGA 

CGTGAAGCTGACCCCCCTGTGCGTG 

CGTGAAGCTGACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGA 
CGTGAAGCTGACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGA 
CGTGAAGCTGACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGA 
CGTGAAGCTGACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGA 
CGTGAAGCTGACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGA 
CGTGAAGCTGACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGA 
401 450 
ACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGATGGACCGCGGCGAG 
ACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGATGGACCGCGGCGAG 

ACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGATGGACCGCGGCGAG 
ACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGATGGACCGCGGCGAG 
ACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGATGGACCGCGGCGAG 
ACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGATGGACCGCGGCGAG 
ACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGATGGACCGCGGCGAG 
ACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGATGGACCGCGGCGAG 
451 500 
ATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGCA 

ATCAAGAACTGCAGCTTCAAGGTGGGC 

GGC 

ATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGCA 
ATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGCA 
ATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGCA 
ATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGCA 
ATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGCA 
ATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGCA 
501 550 
GAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACG 

GAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACG 
GAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACG 
GAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACG 
GAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACG 
GAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACG 
GAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACG 
551 600 
ACAACACCAGCTACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAG 

CAAGCTGATCAACTGCAACACCAGCGTGATCACCCAG 

CAACTGCCAGACCAGCGTGATCACCCAG 

ACAACACCAGCTACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAG 
ACAACACCAGCTACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAG 
ACAACACCAGCTACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAG

ACAACACCAGCTACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAG 

ACAACACCAGCTACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAG 

ACAACACCAGCTACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAG 

601 650 

GCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 

GCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 

GCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 

GCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 

GCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 

GCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 

GCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 

GCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 

GCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 

651 700 

CGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCG 

CGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCG 

CGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCG 

CGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCG 

CGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCG 

CGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCG 

CGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCG 

CGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCG 

CGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCG 

701 750 

GCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

751 800 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGT 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGT 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGT 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGT 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGT 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGT 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGT 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGT 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGT 

801 850 

GGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGC 

GGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGC 

GGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGC 

GGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGC 

GGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGC 

GGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGC 

GGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGC 

GGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGC 

GGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGC 

851 900 

AGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACC 

AGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACC 

AGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACC 

gp160.modSF162 (1151)



AGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACC 

AGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACC 

AGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACC 

AGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACC 

AGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACC 

AGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACC 

901 950 

CGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGA 

CGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGA 

CGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGA 

CGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGA 

CGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGA 

CGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGA 

CGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGA 

CGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGA 

CGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGA 

951 1000 

CATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

CATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

CATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

CATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

CATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

CATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

CATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

CATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

CATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

1001 1050 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGC 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGC 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGC 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGC 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGC 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGC 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGC 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGC 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGC 

1051 1100 

AACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGT 

AACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGT 

AACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGT 

AACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGT 

AACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGT 

AACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGT 

AACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGT 

AACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGT 

AACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGT 

1101 1150

GATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCC 

GATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCC 

GATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCC 

GATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCC 

GATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCC 

GATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCC 

GATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCC 

GATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCC 

GATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCC 

1151 1200 

AGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 



FIG.^A (Sheet 4/9) 



Row labels and start positions for the alignment blocks on this sheet:

Sequence                  Start position in each alignment block
gp160.modSF162            1151  1201  1251  1301  1351  1401
gp160.modSF162.delV2      1070  1120  1170  1220  1270  1320
gp160.modSF162.delV1V2     962  1012  1062  1112  1162  1212
gp140.modSF162            1151  1201  1251  1301  1351  1401
gp140.mut.modSF162        1151  1201  1251  1301  1351  1401
gp140.mut7.modSF162       1151  1201  1251  1301  1351  1401
gp140.mut8.modSF162       1151  1201  1251  1301  1351  1401
gp120.modSF162            1151  1201  1251  1301  1351  1401
Consensus                 1151  1201  1251  1301  1351  1401


AGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 

AGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 

AGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 

AGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 

AGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 

AGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 

AGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 

AGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 

1201 1250 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCA 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCA 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCA 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCA 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCA 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCA 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCA 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCA 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCA 

1251 1300 

GGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCT 

GGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCT 

GGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCT 

GGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCT 

GGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCT 

GGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCT 

GGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCT 

GGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCT 

GGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCT 

1301 1350 

GCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAG 

GCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAG 

GCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAG 

GCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAG 

GCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAG 

GCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAG 

GCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAG 

GCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAG 

GCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAG 

1351 1400 

ATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

ATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

ATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

ATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

ATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

ATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

ATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

ATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

ATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

1401 1450 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 
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1451 1500 

TGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAG 
TGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAG 
TGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAG 
TGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAG 
TGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAG 
TGGGCGTGGCCCCCACCAAGGCCATCAGCAGCGTGGTGCAGAGCGAGAAG 
TGGGCGTGGCCCCCACCATCGCCATCAGCAGCGTGGTGCAGAGCGAGAAG 
TGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAG 
TGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAG 
1501 1550 
CGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGG 
CGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGG 
CGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGG 
CGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGG 
AGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGG 
AGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGG 
AGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGG 

CGC TAACTCGAG 

CGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGG 
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CAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGC 
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CAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGC 
CAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGC 
CAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGC 
CAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGC 
CAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGC 

CAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGC 
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TGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 
TGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 
TGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 
TGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 
TGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 
TGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 
TGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 

TGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 
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GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCA 
GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCA 
GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCA 
GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCA 
GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCA 
GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCA 
GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCA 

GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCA 
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GGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGG 
GGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGG 
GGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGG 
GGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGG 
GGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGG 
GGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGG 
GGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGG 
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(2001) GTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGG 

(1920) GTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGG 

(1812) GTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGG 

(2001) GTGGCTGTGGTACATCTAACTCGAG 

(2001) GTGGCTGTGGTACATCTAACTCGAG 
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(2001) GTGGCTGTGGTACATCTAACTCGAG 

2051 2100 
(2051) GCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAG 
(1970) GCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAG 
(1862) GCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAG 
(2101) GGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCC
(1912) GGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCC
(2101)

2151 2200 
(2151) CGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACC 
(2070) CGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACC
(1962) CGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACC

(2026) 
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(2012 ) GCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTG 
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(1513) 
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(2301) CGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGA 

(2220) CGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGA 

(2112) CGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGA 
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(2301) 

2351 2400 
(2351) AGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGC

(2270) AGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGC
(2162) AGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGC

(2026)
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(2026) 

(1513) 

(2351) 

2401 2450 
(2401) GCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGA 
(2501) CCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG
(2312) CCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG
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FIG. ^7 



HIV-1SF2 wt RT (PISPIET-->GIRKVL) 

CCCATTAGTCCTATTGAAACTGTACCAGTAAAATTAAAGCCAGGAATGGATGGCCCAAAA 

GTTAAGCAATGGCCATTGACAGAAGAAAAAATAAAAGCATTAGTAGAGATATGTACAGAA 

ATGGAAAAGGAAGGGAAAATTTCAAAAATTGGGCCTGAAAATCCATACAATACTCCAGTA 

TTTGCTATAAAGAAAAAAGACAGTACTAAATGGAGAAAACTAGTAGATTTCAGAGAACTT 

AATAAAAGAACTCAAGACTTCTGGGAAGTTCAGTTAGGAATACCACACCCCGCAGGGTTA 

AAAAAGAAAAAATCAGTAACAGTATTGGATGTGGGTGATGCATACTTTTCAGTTCCCTTA 

GATAAAGACTTTAGAAAGTATACTGCATTTACCATACCTAGTATAAACAATGAGACACCA 

GGGATTAGATATCAGTACAATGTGCTGCCACAGGGATGGAAAGGATCACCAGCAATATTC 

CAAAGTAGCATGACAAAAATCTTAGAGCCTTTTAGAAAACAGAATCCAGACATAGTTATC

TATCAAtacatggatgatTTGTATGTAGGATCTGACTTAGAAATAGGGCAGCATAGAACA 

AAAATAGAGGAACTGAGACAGCATCTGTTGAGGTGGGGATTTACCACACCAGACAAAAAA 

CATCAGAAAGAACCTCCATTCCTTtggatgggttatGAACTCCATCCTGATAAATGGACA 

GTACAGCCTATAATGCTGCCAGAAAAAGACAGCTGGACTGTCAATGACATACAGAAGTTA 

GTGGGAAAATTGAATTGGGCAAGTCAGATTTATGCAGGGATTAAAGTAAAGCAGTTATGT 

AAACTCCTTAGAGGAACCAAAGCACTAACAGAAGTAATACCACTAACAGAAGAAGCAGAG 

CTAGAACTGGCAGAAAACAGGGAGATTCTAAAAGAACCAGTACATGAAGTATATTATGAC 

CCATCAAAAGACTTAGTAGCAGAAATACAGAAGCAGGGGCAAGGCCAATGGACATATCAA 

ATTTATCAAGAGCCATTTAAAAATCTGAAAACAGGAAAGTATGCAAGGATGAGGGGTGCC 

CACACTAATGATGTAAAACAGTTAACAGAGGCAGTGCAAAAAGTATCCACAGAAAGCATA 

GTAATATGGGGAAAGATTCCTAAATTTAAACTACCCATACAAAAGGAAACATGGGAAGCA 

TGGTGGATGGAGTATTGGCAAGCTACCTGGATTCCTGAGTGGGAGTTTGTCAATACCCCT 

CCCTTAGTGAAATTATGGTACCAGTTAGAGAAAGAACCCATAGTAGGAGCAGAAACTTTC 

TATGTAGATGGGGCAGCTAATAGGGAGACTAAATTAGGAAAAGCAGGATATGTTACTGAC 

AGAGGAAGACAAAAAGTTGTCTCCATAGCTGACACAACAAATCAGAAGACTGAATTACAA 

GCAATTCATCTAGCTTTGCAGGATTCGGGATTAGAAGTAAACATAGTAACAGACTCACAA 

TATGCATTAGGAATCATTCAAGCACAACCAGATAAGAGTGAATCAGAGTTAGTCAGTCAA 

ATAATAGAGCAGTTAATAAAAAAGGAAAAGGTCTACCTGGCATGGGTACCAGCACACAAA 

GGAATTGGAGGAAATGAACAAGTAGATAAATTAGTCAGTGCTGGAATCAGGAAAGTACTA 



FIG. 68 (SEQ ID NO: 77) 



GagProtMod.SF2 (GP1) 

GTCGACGCCACCATGGGCGCCCGCGCCAGCGTGCTGAGCGGCGGCGAGCTGGACAAGTGG 

GAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGAAGTACAAGCTGAAGCACATCGTGTGG 

GCCAGCCGCGAGCTGGAGCGCTTCGCCGTGAACCCCGGCCTGCTGGAGACCAGCGAGGGC 

TGCCGCCAGATCCTGGGCCAGCTGCAGCCCAGCCTGCAGACCGGCAGCGAGGAGCTGCGC 

AGCCTGTACAACACCGTGGCCACCCTGTACTGCGTGCACCAGCGCATCGACGTCAAGGAC 

ACCAAGGAGGCCCTGGAGAAGATCGAGGAGGAGCAGAACAAGTCCAAGAAGAAGGCCCAG 

CAGGCCGCCGCCGCCGCCGGCACCGGCAACAGCAGCCAGGTGAGCCAGAACTACCCCATC 

GTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCC 

TGGGTGAAGGTGGTGGAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCAGCGCC 

CTGAGCGAGGGCGCCACCCCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCAC 

CAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGC 

GTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCAGC 

GACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGGCTGGATGACCAACAACCCC 

CCCATCCCCGTGGGCGAGATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTG 

CGGATGTACAGCCCCACCAGCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGC 

GACTACGTGGACCGCTTCTACAAGACCCTGCGCGCTGAGCAGGCCAGCCAGGACGTGAAG 

AACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTG 

AAGGCTCTCGGCCCCGCGGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGC 

GGCCCCGGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGCCAGGTGACGAACCCGGCG 

ACCATCATGATGCAGCGCGGCAACTTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCAAC 

TGCGGCAAGGAGGGCCACACCGCCAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGG 

CGCTGCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTTTTA 

GGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGAG 

CCAACAGCCCCACCAGAAGAGAGCTTCAGGTTTGGGGAGGAGAAAACAACTCCCTCTCAG 

AAGCAGGAGCCGATAGACAAGGAACTGTATCCTTTAACTTCCCTCAGATCACTCTTTGGC 

AACGACCCCTCGTCACAGTAAGGATCGGCGGCCAGCTCAAGGAGGCGCTGCTCGACACCG 

GCGCCGACGACACCGTGCTGGAGGAGATGAACCTGCCCGGCAAGTGGAAGCCCAAGATGA 

TCGGCGGGATCGGGGGCTTCATCAAGGTGCGGCAGTACGACCAGATCCCCGTGGAGATCT 

GCGGCCACAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGTGAACATCATCGGCC 

GCAACCTGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACGG 

TGCCCGTGAAGCTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAGTGGCCCCTGTAAG 

AATTC 



FIG. 69 (SEQ ID NO: 78)



GagProtMod.SF2 (GP2) 

GTCGACGCCACCATGGGCGCCCGCGCCAGCGTGCTGAGCGGCGGCGAGCTGGACAAGTGG 
GAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGAAGTACAAGCTGAAGCACATCGTGTGG 
GCCAGCCGCGAGCTGGAGCGCTTCGCCGTGAACCCCGGCCTGCTGGAGACCAGCGAGGGC 
TGCCGCCAGATCCTGGGCCAGCTGCAGCCCAGCCTGCAGACCGGCAGCGAGGAGCTGCGC 
AGCCTGTACAACACCGTGGCCACCCTGTACTGCGTGCACCAGCGCATCGACGTCAAGGAC 
ACCAAGGAGGCCCTGGAGAAGATCGAGGAGGAGCAGAACAAGTCCAAGAAGAAGGCCCAG 
CAGGCCGCCGCCGCCGCCGGCACCGGCAACAGCAGCCAGGTGAGCCAGAACTACCCCATC 
GTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCC 
TGGGTGAAGGTGGTGGAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCAGCGCC 
CTGAGCGAGGGCGCCACCCCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCAC 
CAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGC 
GTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCAGC 
GACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGGCTGGATGACCAACAACCCC
CCCATCCCCGTGGGCGAGATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTG 
CGGATGTACAGCCCCACCAGCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGC 
GACTACGTGGACCGCTTCTACAAGACCCTGCGCGCTGAGCAGGCCAGCCAGGACGTGAAG 
AACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTG 
AAGGCTCTCGGCCCCGCGGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGC 
GGCCCCGGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGCCAGGTGACGAACCCGGCG 
ACCATCATGATGCAGCGCGGCAACTTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCAAC 
TGCGGCAAGGAGGGCCACACCGCCAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGG 
CGCTGCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTTTTA 
GGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGAG 
CCAACAGCCCCACCAGAAGAGAGCTTCAGGTTTGGGGAGGAGAAAACAACTCCCTCTCAG 
AAGCAGGAGCCGATAGACAAGGAACTGTATCCTTTAACTTCCCTCAGATCACTCTTTGGC 
AACGACCCCTCGTCACAGTAAGGATCGGGGGGCAACTCAAGGAAGCGCTGCTCGATACAG 
GAGCAGATGATACAGTATTAGAAGAAATGAATTTGCCAGGAAAATGGAAACCAAAAATGA 
TAGGGGGGATCGGGGGCTTCATCAAGGTGAGGCAGTACGACCAGATACCTGTAGAAATCT 
GTGGACATAAAGCTATAGGTACAGTATTAGTAGGACCTACACCTGTCAACATAATTGGAA 
GAAATCTGTTGACCCAGATCGGCTGCACCTTGAACTTCCCCATCAGCCCTATTGAGACGG 
TGCCCGTGAAGTTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAATGGCCATTGTAAG 
AATTC 



FIG. 70 (SEQ ID NO: 79) 



FS(+)_ProtInact_RTopt_YM 

GCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTTTTAGGGA 
AGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGAGCCAA 
CAGCCCCACCAGAAGAGAGCTTCAGGTTTGGGGAGGAGAAAACAACTCCCTCTCAGAAGC 
AGGAGCCGATAGACAAGGAACTGTATCCTTTAACTTCCCTCAGATCACTCTTTGGCAACG 
ACCCCTCGTCACAATAAGGATCGGGGGGCAACTCAAGGAAGCGCTGCTCGATACAGGAGC 
AGATGATACAGTATTAGAAGAAATGAATTTGCCAGGAAAATGGAAACCAAAAATGATAGG 
GGGGATCGGGGGCTTCATCAAGGTGAGGCAGTACGACCAGATACCTGTAGAAATCTGTGG 
ACATAAAGCTATAGGTACAGTATTAGTAGGACCTACACCTGTCAACATAATTGGAAGAAA 
TCTGTTGACCCAGATCGGCTGCACCTTGAACTTCCCCATCAGCCCTATTGAGACGGTGCC 
CGTGAAGTTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAATGGCCATTGACCGAGGA 
GAAGATCAAGGCCCTGGTGGAGATCTGCACCGAGATGGAGAAGGAGGGCAAGATCAGCAA 
GATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCAC 
CAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGA 
GGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGCT 
GGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACAAGGACTTCCGCAAGTACACCGC 
CTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCT 
GCCCCAGGGCTGGAAGGGCAGCCCCGCCATCTTCCAGAGCAGCATGACCAAGATCCTGGA 
GCCCTTCCGCAAGCAGAACCCCGACATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAG 
CGACCTGGAGATCGGCCAGCACCGCACCAAGATCGAGGAGCTGCGCCAGCACCTGCTGCG 
CTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTGGATGGG 
CTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCATGCTGCCCGAGAAGGACAG 
CTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTA 
CGCCGGCATCAAGGTGAAGCAGCTGTGCAAGCTGCTGCGCGGCACCAAGGCCCTGACCGA 
GGTGATCCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAA 
GGAGCCCGTGCACGAGGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAA 
GCAGGGCCAGGGCCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGAC 
CGGCAAGTACGCCCGCATGCGCGGCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGC 
CGTGCAGAAGGTGAGCACCGAGAGCATCGTGATCTGGGGCAAGATCCCCAAGTTCAAGCT 



FIG. 71 (SEQ ID NO: 80) 



GCCCATCCAGAAGGAGACCTGGGAGGCCTGGTGGATGGAGTACTGGCAGGCCACCTGGAT 
CCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAA 
GGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAA 
GCTGGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGGTGGTGAGCATCGCCGA 
CACCACCAACCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACAGCGGCCT 
GGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGA 
CAAGAGCGAGAGCGAGCTGGTGAGCCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGT 
GTACCTGGCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCT 
GGTGAGCGCCGGCATCCGCAAGGTGCTGTTCCTGAACGGCATCGATGGCGGCATCGTGAT 
CTACCAGTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGCT 
TCCCGGGGCTAGCACCGGTGAATTC 



FIG. 71 (CONT'D.) (SEQ ID NO: 80)



FS(+)_ProtInact_RTopt_YMWM 

GCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTTTTAGGGA 

AGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGAGCCAA 

CAGCCCCACCAGAAGAGAGCTTCAGGTTTGGGGAGGAGAAAACAACTCCCTCTCAGAAGC 

AGGAGCCGATAGACAAGGAACTGTATCCTTTAACTTCCCTCAGATCACTCTTTGGCAACG 

ACCCCTCGTCACAATAAGGATCGGGGGGCAACTCAAGGAAGCGCTGCTCGATACAGGAGC 

AGATGATACAGTATTAGAAGAAATGAATTTGCCAGGAAAATGGAAACCAAAAATGATAGG 

GGGGATCGGGGGCTTCATCAAGGTGAGGCAGTACGACCAGATACCTGTAGAAATCTGTGG 

ACATAAAGCTATAGGTACAGTATTAGTAGGACCTACACCTGTCAACATAATTGGAAGAAA 

TCTGTTGACCCAGATCGGCTGCACCTTGAACTTCCCCATCAGCCCTATTGAGACGGTGCC 

CGTGAAGTTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAATGGCCATTGACCGAGGA 

GAAGATCAAGGCCCTGGTGGAGATCTGCACCGAGATGGAGAAGGAGGGCAAGATCAGCAA 

GATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCAC 

CAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGA 

GGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGCT 

GGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACAAGGACTTCCGCAAGTACACCGC 

CTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCT 

GCCCCAGGGCTGGAAGGGCAGCCCCGCCATCTTCCAGAGCAGCATGACCAAGATCCTGGA 

GCCCTTCCGCAAGCAGAACCCCGACATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAG 

CGACCTGGAGATCGGCCAGCACCGCACCAAGATCGAGGAGCTGCGCCAGCACCTGCTGCG 

CTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATCGA 

GCTGCACCCCGACAAGTGGACCGTGCAGCCCATCATGCTGCCCGAGAAGGACAGCTGGAC 

CGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACGCCGG 

CATCAAGGTGAAGCAGCTGTGCAAGCTGCTGCGCGGCACCAAGGCCCTGACCGAGGTGAT 

CCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAAGGAGCC 

CGTGCACGAGGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGG 

CCAGGGCCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAA 

GTACGCCCGCATGCGCGGCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCA 

GAAGGTGAGCACCGAGAGCATCGTGATCTGGGGCAAGATCCCCAAGTTCAAGCTGCCCAT 



FIG. 72 (SEQ ID NO: 81)



CCAGAAGGAGACCTGGGAGGCCTGGTGGATGGAGTACTGGCAGGCCACCTGGATCCCCGA 
GTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCC 
CATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGCTGGG 
CAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGGTGGTGAGCATCGCCGACACCAC 
CAACCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACAGCGGCCTGGAGGT 
GAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAG 
CGAGAGCGAGCTGGTGAGCCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCT 
GGCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGAG 
CGCCGGCATCCGCAAGGTGCTGTTCCTGAACGGCATCGATGGCGGCATCGTGATCTACCA 
GTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGCTTCCCGG 
GGCTAGCACCGGTGAATTC 



FIG. 72 (CONT'D.) (SEQ ID NO: 81)



FS(-)_ProtMod_RTopt_YM 

GCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTCTTCCGCG 

AGGACCTGGCCTTCCTGCAGGGCAAGGCCCGCGAGTTCAGCAGCGAGCAGACCCGCGCCA 

ACAGCCCCACCCGCCGCGAGCTGCAGGTGTGGGGCGGCGAGAACAACAGCCTGAGCGAGG 

CCGGCGCCGACCGCCAGGGCACCGTGAGCTTCAACTTCCCCCAGATCACCCTGTGGCAGC 

GCCCCCTGGTGACCATCAGGATCGGCGGCCAGCTCAAGGAGGCGCTGCTCGACACCGGCG 

CCGACGACACCGTGCTGGAGGAGATGAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCG 

GCGGGATCGGGGGCTTCATCAAGGTGCGGCAGTACGACCAGATCCCCGTGGAGATCTGCG 

GCCACAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGTGAACATCATCGGCCGCA 

ACCTGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACGGTGC 

CCGTGAAGCTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAGTGGCCCCTGACCGAGG 

AGAAGATCAAGGCCCTGGTGGAGATCTGCACCGAGATGGAGAAGGAGGGCAAGATCAGCA 

AGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCA 

CCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGG 

AGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGC 

TGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACAAGGACTTCCGCAAGTACACCG 

CCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGC 

TGCCCCAGGGCTGGAAGGGCAGCCCCGCCATCTTCCAGAGCAGCATGACCAAGATCCTGG 

AGCCCTTCCGCAAGCAGAACCCCGACATCGTGATCTACCAGGCCCCCCTGTACGTGGGCA 

GCGACCTGGAGATCGGCCAGCACCGCACCAAGATCGAGGAGCTGCGCCAGCACCTGCTGC 

GCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTGGATGG 

GCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCATGCTGCCCGAGAAGGACA 

GCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCT 

ACGCCGGCATCAAGGTGAAGCAGCTGTGCAAGCTGCTGCGCGGCACCAAGGCCCTGACCG 

AGGTGATCCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGA 

AGGAGCCCGTGCACGAGGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGA 

AGCAGGGCCAGGGCCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGA 

CCGGCAAGTACGCCCGCATGCGCGGCGCCCACACCAACGACGTGAAGCAGCTGACCGAGG 

CCGTGCAGAAGGTGAGCACCGAGAGCATCGTGATCTGGGGCAAGATCCCCAAGTTCAAGC 



FIG. 73 (SEQ ID NO: 82)



TGCCCATCCAGAAGGAGACCTGGGAGGCCTGGTGGATGGAGTACTGGCAGGCCACCTGGA 
TCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGA 
AGGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCA 
AGCTGGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGGTGGTGAGCATCGCCG 
ACACCACCAACCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACAGCGGCC 
TGGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCG 
ACAAGAGCGAGAGCGAGCTGGTGAGCCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGG 
TGTACCTGGCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGC 
TGGTGAGCGCCGGCATCCGCAAGGTGCTGTTCCTGAACGGCATCGATGGCGGCATCGTGA 
TCTACCAGTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGC 
TTCCCGGGGCTAGCACCGGTGAATTC 



FIG. 73 (CONT'D.) (SEQ ID NO: 82)



FS(-)_ProtMod_RTopt_YMWM 

GCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTCTTCCGCG 

AGGACCTGGCCTTCCTGCAGGGCAAGGCCCGCGAGTTCAGCAGCGAGCAGACCCGCGCCA 

ACAGCCCCACCCGCCGCGAGCTGCAGGTGTGGGGCGGCGAGAACAACAGCCTGAGCGAGG 

CCGGCGCCGACCGCCAGGGCACCGTGAGCTTCAACTTCCCCCAGATCACCCTGTGGCAGC 

GCCCCCTGGTGACCATCAGGATCGGCGGCCAGCTCAAGGAGGCGCTGCTCGACACCGGCG 

CCGACGACACCGTGCTGGAGGAGATGAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCG 

GCGGGATCGGGGGCTTCATCAAGGTGCGGCAGTACGACCAGATCCCCGTGGAGATCTGCG 

GCCACAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGTGAACATCATCGGCCGCA 

ACCTGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACGGTGC 

CCGTGAAGCTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAGTGGCCCCTGACCGAGG 

AGAAGATCAAGGCCCTGGTGGAGATCTGCACCGAGATGGAGAAGGAGGGCAAGATCAGCA 

AGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCA 

CCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGG 

AGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGC 

TGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACAAGGACTTCCGCAAGTACACCG 

CCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGC 

TGCCCCAGGGCTGGAAGGGCAGCCCCGCCATCTTCCAGAGCAGCATGACCAAGATCCTGG 

AGCCCTTCCGCAAGCAGAACCCCGACATCGTGATCTACCAGGCCCCCCTGTACGTGGGCA 

GCGACCTGGAGATCGGCCAGCACCGCACCAAGATCGAGGAGCTGCGCCAGCACCTGCTGC 

GCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATCG 

AGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCATGCTGCCCGAGAAGGACAGCTGGA 

CCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACGCCG 

GCATCAAGGTGAAGCAGCTGTGCAAGCTGCTGCGCGGCACCAAGGCCCTGACCGAGGTGA 

TCCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAAGGAGC 

CCGTGCACGAGGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGG 

GCCAGGGCCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCA 

AGTACGCCCGCATGCGCGGCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGC 

AGAAGGTGAGCACCGAGAGCATCGTGATCTGGGGCAAGATCCCCAAGTTCAAGCTGCCCA 



FIG. 74 (SEQ ID NO: 83)



TCCAGAAGGAGACCTGGGAGGCCTGGTGGATGGAGTACTGGCAGGCCACCTGGATCCCCG 
AGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGC 
CCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGCTGG 
GCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGGTGGTGAGCATCGCCGACACCA 
CCAACCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACAGCGGCCTGGAGG 
TGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGA 
GCGAGAGCGAGCTGGTGAGCCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACC 
TGGCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGA 
GCGCCGGCATCCGCAAGGTGCTGTTCCTGAACGGCATCGATGGCGGCATCGTGATCTACC 
AGTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGCTTCCCG 
GGGCTAGCACCGGTGAATTC 



FIG. 74 (CONT'D.) (SEQ ID NO: 83)



FS(-)_ProtMod_RTopt(+) 

GCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTCTTCCGCG 

AGGACCTGGCCTTCCTGCAGGGCAAGGCCCGCGAGTTCAGCAGCGAGCAGACCCGCGCCA 

ACAGCCCCACCCGCCGCGAGCTGCAGGTGTGGGGCGGCGAGAACAACAGCCTGAGCGAGG 

CCGGCGCCGACCGCCAGGGCACCGTGAGCTTCAACTTCCCCCAGATCACCCTGTGGCAGC 

GCCCCCTGGTGACCATCAGGATCGGCGGCCAGCTCAAGGAGGCGCTGCTCGACACCGGCG 

CCGACGACACCGTGCTGGAGGAGATGAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCG 

GCGGGATCGGGGGCTTCATCAAGGTGCGGCAGTACGACCAGATCCCCGTGGAGATCTGCG 

GCCACAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGTGAACATCATCGGCCGCA 

ACCTGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACGGTGC 

CCGTGAAGCTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAGTGGCCCCTGACCGAGG 

AGAAGATCAAGGCCCTGGTGGAGATCTGCACCGAGATGGAGAAGGAGGGCAAGATCAGCA 

AGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCA 

CCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGG 

AGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGC 

TGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACAAGGACTTCCGCAAGTACACCG 

CCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGC 

TGCCCCAGGGCTGGAAGGGCAGCCCCGCCATCTTCCAGAGCAGCATGACCAAGATCCTGG 

AGCCCTTCCGCAAGCAGAACCCCGACATCGTGATCTACCAGTACATGGACGACCTGTACG 

TGGGCAGCGACCTGGAGATCGGCCAGCACCGCACCAAGATCGAGGAGCTGCGCCAGCACC 

TGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGT 

GGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCATGCTGCCCGAGA 

AGGACAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCC 

AGATCTACGCCGGCATCAAGGTGAAGCAGCTGTGCAAGCTGCTGCGCGGCACCAAGGCCC 

TGACCGAGGTGATCCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGA 

TCCTGAAGGAGCCCGTGCACGAGGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGA 

TCCAGAAGCAGGGCCAGGGCCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACC 

TGAAGACCGGCAAGTACGCCCGCATGCGCGGCGCCCACACCAACGACGTGAAGCAGCTGA 

CCGAGGCCGTGCAGAAGGTGAGCACCGAGAGCATCGTGATCTGGGGCAAGATCCCCAAGT 

TCAAGCTGCCCATCCAGAAGGAGACCTGGGAGGCCTGGTGGATGGAGTACTGGCAGGCCA 

CCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGC 

TGGAGAAGGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCG 



FIG. 75 (SEQ ID NO: 84)



AGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGGTGGTGAGCA 
TCGCCGACACCACCAACCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACA 
GCGGCCTGGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCC 
AGCCCGACAAGAGCGAGAGCGAGCTGGTGAGCCAGATCATCGAGCAGCTGATCAAGAAGG 
AGAAGGTGTACCTGGCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGG 
ACAAGCTGGTGAGCGCCGGCATCCGCAAGGTGCTGTTCCTGAACGGCATCGATGGCGGCA 
TCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATT 
AAAAGCTTCCCGGGGCTAGCACCGGTGAATTC 



FIG. 75 (CONT'D.) (SEQ ID NO: 84) 



Tat_wt_SF162 (wildtype) 

ATGGAGCCAGTAGATCCTAGATTAGAGCCCTGGAAGCATCCAGGAAGTCAGCCTAAGA 

CTGCTTGTACAAATTGCTATTGTAAAAAGTGTTGCTTTCATTGCCAAGTTTGTTTCATAAC 

AAAAGGCTTAGGCATCTCCTATGGCAGGAAGAAGCGGAGACAGCGACGAAGAGCTCCT 

CCAGACAGTGAGGTTCATCAAGTTTCTCTACCAAAGCAACCCGCTTCCCAGCCCCAAGG 

GGACCCGACAGGCCCGAAGGAATCGAAGAAGAAGGTGGAGAGAGAGACAGAGACAGA 

TCCAGTCCATTAG 



FIG. 76 (SEQ ID NO: 85)



Tat_SF162 

MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKGLGISYGRKKRRQRRRAPPDSE 
VHQVSLPKQPASQPQGDPTGPKESKKKVERETETDPVH 



FIG. 77 (SEQ ID NO: 86)



Tat_SF162_opt 



ATGGAGCCCGTGGACCCCCGCCTGGAGCCCTGGAAGCACCCCGGCAGCCAGCCCAAGAC 

CGCCTGCACCAACTGCTACTGCAAGAAGTGCTGCTTCCACTGCCAGGTGTGCTTCATCACC 

AAGGGCCTGGGCATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCCGCGCCCCCCC 

CGACAGCGAGGTGCACCAGGTGAGCCTGCCCAAGCAGCCCGCCAGCCAGCCCCAGGGCG 

ACCCCACCGGCCCCAAGGAGAGCAAGAAGAAGGTGGAGCGCGAGACCGAGACCGACCCC 

GTGCACTAG 



FIG. 78 (SEQ ID NO: 87) 



Tat_Cys22_SF162_opt



ATGGAGCCCGTGGACCCCCGCCTGGAGCCCTGGAAGCACCCCGGCAGCCAGCCCAAGAC 

CGCCgGCACCAACTGCTACTGCAAGAAGTGCTGCTTCCACTGCCAGGTGTGCTTCATCACCA 

AGGGCCTGGGCATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCCGCGCCCCCCCC 

GACAGCGAGGTGCACCAGGTGAGCCTGCCCAAGCAGCCCGCCAGCCAGCCCCAGGGCGA 

CCCCACCGGCCCCAAGGAGAGCAAGAAGAAGGTGGAGCGCGAGACCGAGACCGACCCCG 

TGCACTAG 



FIG. 79 (SEQ ID NO: 88)
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TataminoSF162.opt 



ATGGAGCCCGTGGACCCCCGCCTGGAGCCCTGGAAGCACCCCGGCAGCCAGCCCAA 
GACCGCCTGCACCAACTGCTACTGCAAGAAGTGCTGCTTCCACTGCCAGGTGTGCTT 
CATCACCAAGGGCCTGGGCATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGC 



FIG. 81 (SEQ ID NO: 89)



Tat_Cys22_SF162 

MEPVDPRLEPWKHPGSQPKTAGTNCYCKKCCFHCQVCFITKGLGISYGRKKRRQRRRAPPDSE 
VHQVSLPKQPASQPQGDPTGPKESKKKVERETETDPVHZ 



FIG. 82 (SEQ ID NO: 90)



Atty Dkt No. 1621.002
2302-1621 

COMBINED DECLARATION AND POWER OF ATTORNEY 
FOR UTILITY PATENT APPLICATION 

AS A BELOW-NAMED INVENTOR, I HEREBY DECLARE THAT: 

My residence, post office address and citizenship are as stated below next to my name. 

I believe I am the original, first and sole inventor (if only one name is listed below) or an original, 
first and joint inventor (if more than one name is listed below) of the subject matter which is 
claimed and for which a patent is sought on the invention entitled: IMPROVED EXPRESSION OF 
HIV POLYPEPTIDES AND PRODUCTION OF VIRUS-LIKE PARTICLES the specification of 
which 

X is attached hereto 
was filed on 

and assigned Serial No. and was amended on . 

I HAVE REVIEWED AND UNDERSTAND THE CONTENTS OF THE ABOVE-IDENTIFIED 
SPECIFICATION, INCLUDING THE CLAIMS, AS AMENDED BY ANY AMENDMENT 
REFERRED TO ABOVE. 

I acknowledge and understand that I am an individual who has a duty to disclose information which 
is material to the patentability of the claims of this application in accordance with Title 37, Code of 
Federal Regulations, §§ 1.56(a) and (b) which state: 

(a) A patent by its very nature is affected with a public interest. The public interest is 
best served, and the most effective patent examination occurs when, at the time an 
application is being examined, the Office is aware of and evaluates the teachings of 
all information material to patentability. Each individual associated with the filing 
and prosecution of a patent application has a duty of candor and good faith in dealing 
with the Office, which includes a duty to disclose to the Office all information 
known to that individual to be material to patentability as defined in this section. 
The duty to disclose information exists with respect to each pending claim until the 
claim is canceled or withdrawn from consideration, or the application becomes 
abandoned. Information material to the patentability of a claim that is canceled or 
withdrawn from consideration need not be submitted if the information is not 
material to the patentability of any claim remaining under consideration in the 
application. There is no duty to submit information which is not material to the 
patentability of any existing claim. The duty to disclose all information known to be 
material to patentability is deemed to be satisfied if all information known to be 
material to patentability of any claim issued in a patent was cited by the Office or 
submitted to the Office in the manner prescribed by §§ 1.97(b)-(d) and 1.98.
However, no patent will be granted on an application in connection with which fraud 
on the Office was practiced or attempted or the duty of disclosure was violated 
through bad faith or intentional misconduct. The Office encourages applicants to
carefully examine: 

(1) prior art cited in search reports of a foreign patent office in a counterpart 
application, and 

(2) the closest information over which individuals associated with the filing 
or prosecution of a patent application believe any pending claim patentably defines, 
to make sure that any material information contained therein is disclosed to the 
Office. 

(b) Under this section, information is material to patentability when it is not 
cumulative to information already of record or being made of record in the 
application, and 

(1) It establishes, by itself or in combination with other information, a prima 
facie case of unpatentability of a claim; or 

(2) It refutes, or is inconsistent with, a position the applicant takes in: 

(i) Opposing an argument of unpatentability relied on by the Office, 

or 

(ii) Asserting an argument of patentability. 

A prima facie case of unpatentability is established when the information compels a 
conclusion that a claim is unpatentable under the preponderance of evidence, burden-of-proof
standard, giving each term in the claim its broadest reasonable construction
consistent with the specification, and before any consideration is given to evidence 
which may be submitted in an attempt to establish a contrary conclusion of 
patentability. 

I do not know and do not believe this invention was ever known or used in the United States of
America before my or our invention thereof, or patented or described in any printed publication in 
any country before my or our invention thereof or more than one year prior to said application. This 
invention was not in public use or on sale in the United States of America more than one year prior 
to this application. This invention has not been patented or made the subject of an inventor's 
certificate issued before the date of this application in any country foreign to the United States of 
America on any application filed by me or my legal representatives or assigns more than six months 
prior to this application. 

I hereby claim priority benefits under Title 35, United States Code § 119(e)(1) of any United States
provisional application(s) for patent as indicated below and have also identified below any 
application for patent on this invention having a filing date before that of the application for patent 
on which priority is claimed: 

Application No.    Date of Filing (day/month/year)    Priority Claimed

60/114,495         31 December 1998                   Yes _X_ No _

60/168,471         01 December 1999                   Yes _X_ No _

I hereby appoint the following attorneys and agents to prosecute that application and to transact all
business in the Patent and Trademark Office connected therewith and to file, to prosecute and to 
transact all business in connection with all patent applications directed to the invention: 



Lisa E. Alexander, Reg. No. 41,576 
Robert P. Blackburn, Reg. No. 30,447 
Anne S. Dollard, Reg. No. 43,935 
Joseph H. Guth, Reg. No. 31,261 
Alisa A. Harbin, Reg. No. 33,895 
Charlene A. Launer, Reg. No. 33,035 
David P. Lentini, Reg. No. 33,944 
Kimberlin L. Morley, Reg. No. 35,391 
Roberta L. Robins, Reg. No. 33,208 
Dahna S. Pasternak, Reg. No. 41,411
Vandana Date, Reg. No. 38,675 
Gary R. Fabian, Ph.D., Reg. No. 33,875

Address all correspondence to: Anne S. Dollard, Esq. at

CHIRON CORPORATION
Intellectual Property - R440
P.O. Box 8097
Emeryville, CA 94662-8097.

Address all telephone calls to: Anne S. Dollard, Esq. at 510-923-2719.



This appointment, including the right to delegate this appointment, shall also apply to the same 
extent to any proceedings established by the Patent Cooperation Treaty. 

I hereby declare that all statements made herein of my own knowledge are true and that all 
statements made on information and belief are believed to be true; and further that these statements 
were made with the knowledge that willful false statements and the like so made are punishable by 
fine or imprisonment, or both, under § 1001 of Title 18 of the United States Code and that such 
willful false statements may jeopardize the validity of the application or any patent issued thereon. 

Signature: Date 

Full Name of Inventor: Susan BARNETT 

Citizenship: US 

Residence: San Francisco, CA 

Post Office Address: c/o Chiron Corporation, P.O. Box 8097, Emeryville, CA 94662-8097 

Signature: Date 

Full Name of Inventor: Jan ZUR MEGEDE 
Citizenship: Germany 
Residence: San Francisco, CA 

Post Office Address: c/o Chiron Corporation, P.O. Box 8097, Emeryville, CA 94662-8097

Signature: Date

Full Name of Inventor: Indresh SRIVASTAVA 
Citizenship: India 
Residence: Benicia, CA 

Post Office Address: c/o Chiron Corporation, P.O. Box 8097, Emeryville, CA 94662-8097 



Signature: Date

Full Name of Inventor: Ying LIAN 
Citizenship: China 
Residence: Albany, CA 

Post Office Address: c/o Chiron Corporation, P.O. Box 8097, Emeryville, CA 94662-8097 



Signature: Date

Full Name of Inventor: Karin HARTOG 
Citizenship: Chile 
Residence: Piedmont, CA 

Post Office Address: c/o Chiron Corporation, P.O. Box 8097, Emeryville, CA 94662-8097 

Signature: Date

Full Name of Inventor: Hong LIU 
Citizenship: China 
Residence: Castro Valley, CA 

Post Office Address: c/o Chiron Corporation, P.O. Box 8097, Emeryville, CA 94662-8097 

Signature: Date

Full Name of Inventor: Catherine GREER 
Citizenship: US 
Residence: Oakland, CA 

Post Office Address: c/o Chiron Corporation, P.O. Box 8097, Emeryville, CA 94662-8097 

Signature: Date 

Full Name of Inventor: Mark SELBY 
Citizenship: US 
Residence: Berkeley, CA 

Post Office Address: c/o Chiron Corporation, P.O. Box 8097, Emeryville, CA 94662-8097 



Signature: Date

Full Name of Inventor: Christopher WALKER 
Citizenship: US 
Residence: Columbus, OH 

Post Office Address: c/o Chiron Corporation, P.O. Box 8097, Emeryville, CA 94662-8097 



SEQUENCE LISTING 



<110> BARNETT, Susan 
ZUR MEGEDE, Jan 
SRIVASTAVA, Indresh 
LIAN, Ying 
HARTOG, Karin 
LIU, Hong 
GREER, Catherine 
SELBY, Mark
WALKER, Christopher

<120> IMPROVED EXPRESSION OF HIV POLYPEPTIDES AND PRODUCTION 
OF VIRUS-LIKE PARTICLES

<130> 1621.002 

<140> 
<141> 

<160> 90 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 1509 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 1 

atgggtgcga gagcgtcggt attaagcggg ggagaattag ataaatggga aaaaattcgg 60 
ttaaggccag ggggaaagaa aaaatataag ttaaaacata tagtatgggc aagcagggag 120 
ctagaacgat tcgcagtcaa tcctggcctg ttagaaacat cagaaggctg cagacaaata 180
ttgggacagc tacagccatc ccttcagaca ggatcagaag aacttagatc attatataat 240
acagtagcaa ccctctattg tgtacatcaa aggatagatg taaaagacac caaggaagct 300 
ttagagaaga tagaggaaga gcaaaacaaa agtaagaaaa aggcacagca agcagcagct 360 
gcagctggca caggaaacag cagccaggtc agccaaaatt accctatagt gcagaaccta 420 
caggggcaaa tggtacatca ggccatatca cctagaactt taaatgcatg ggtaaaagta 480 
gtagaagaaa aggctttcag cccagaagta atacccatgt tttcagcatt atcagaagga 540 
gccaccccac aagatttaaa caccatgcta aacacagtgg ggggacatca agcagccatg 600 
caaatgttaa aagagactat caatgaggaa gctgcagaat gggatagagt gcatccagtg 660 
catgcagggc ctattgcacc aggccaaatg agagaaccaa ggggaagtga catagcagga 720
actactagta cccttcagga acaaatagga tggatgacaa ataatccacc tatcccagta 780
ggagaaatct ataaaagatg gataatcctg ggattaaata aaatagtaag aatgtatagc 840 
cctaccagca ttctggacat aagacaagga ccaaaggaac cctttagaga ttatgtagac 900 
cggttctata aaactctaag agccgaacaa gcttcacagg atgtaaaaaa ttggatgaca 960 
gaaaccttgt tggtccaaaa tgcaaaccca gattgtaaga ctattttaaa agcattggga 1020 
ccagcagcta cactagaaga aatgatgaca gcatgtcagg gagtgggggg acccggccat 1080
aaagcaagag ttttggctga agccatgagc caagtaacaa atccagctaa cataatgatg 1140
cagagaggca attttaggaa ccaaagaaag actgttaagt gtttcaattg tggcaaagaa 1200 
gggcacatag ccaaaaattg cagggcccct aggaaaaagg gctgttggag atgtggaagg 1260 
gaaggacacc aaatgaaaga ttgcactgag agacaggcta attttttagg gaagatctgg 1320
ccttcctaca agggaaggcc agggaatttt cttcagagca gaccagagcc aacagcccca 1380
ccagaagaga gcttcaggtt tggggaggag aaaacaactc cctctcagaa gcaggagccg 1440
atagacaagg aactgtatcc tttaacttcc ctcagatcac tctttggcaa cgacccctcg 1500 
tcacaataa 1509 



<210> 2 
<211> 1845 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 2 

atgggtgcga gagcgtcggt attaagcggg ggagaattag ataaatggga aaaaattcgg 60 
ttaaggccag ggggaaagaa aaaatataag ttaaaacata tagtatgggc aagcagggag 120
ctagaacgat tcgcagtcaa tcctggcctg ttagaaacat cagaaggctg cagacaaata 180 
ttgggacagc tacagccatc ccttcagaca ggatcagaag aacttagatc attatataat 240 
acagtagcaa ccctctattg tgtacatcaa aggatagatg taaaagacac caaggaagct 300 
ttagagaaga tagaggaaga gcaaaacaaa agtaagaaaa aggcacagca agcagcagct 360
gcagctggca caggaaacag cagccaggtc agccaaaatt accctatagt gcagaaccta 420
caggggcaaa tggtacatca ggccatatca cctagaactt taaatgcatg ggtaaaagta 480 
gtagaagaaa aggctttcag cccagaagta atacccatgt tttcagcatt atcagaagga 540 
gccaccccac aagatttaaa caccatgcta aacacagtgg ggggacatca agcagccatg 600 
caaatgttaa aagagactat caatgaggaa gctgcagaat gggatagagt gcatccagtg 660 
catgcagggc ctattgcacc aggccaaatg agagaaccaa ggggaagtga catagcagga 720
actactagta cccttcagga acaaatagga tggatgacaa ataatccacc tatcccagta 780 
ggagaaatct ataaaagatg gataatcctg ggattaaata aaatagtaag aatgtatagc 840 
cctaccagca ttctggacat aagacaagga ccaaaggaac cctttagaga ttatgtagac 900 
cggttctata aaactctaag agccgaacaa gcttcacagg atgtaaaaaa ttggatgaca 960 
gaaaccttgt tggtccaaaa tgcaaaccca gattgtaaga ctattttaaa agcattggga 1020
ccagcagcta cactagaaga aatgatgaca gcatgtcagg gagtgggggg acccggccat 1080
aaagcaagag ttttggctga agccatgagc caagtaacaa atccagctaa cataatgatg 1140
cagagaggca attttaggaa ccaaagaaag actgttaagt gtttcaattg tggcaaagaa 1200
gggcacatag ccaaaaattg cagggcccct aggaaaaagg gctgttggag atgtggaagg 1260 
gaaggacacc aaatgaaaga ttgcactgag agacaggcta attttttagg gaagatctgg 1320 
ccttcctaca agggaaggcc agggaatttt cttcagagca gaccagagcc aacagcccca 1380
ccagaagaga gcttcaggtt tggggaggag aaaacaactc cctctcagaa gcaggagccg 1440
atagacaagg aactgtatcc tttaacttcc ctcagatcac tctttggcaa cgacccctcg 1500
tcacaataag gatagggggg caactaaagg aagctctatt agatacagga gcagatgata 1560 
cagtattaga agaaatgaat ttgccaggaa aatggaaacc aaaaatgata gggggaattg 1620 
gaggttttat caaagtaaga cagtacgatc agatacctgt agaaatctgt ggacataaag 1680 
ctataggtac agtattagta ggacctacac ctgtcaacat aattggaaga aatctgttga 1740 
ctcagattgg ttgtacttta aatttcccca ttagtcctat tgaaactgta ccagtaaaat 1800 
taaagccagg aatggatggc ccaaaagtta agcaatggcc attga 1845 

<210> 3 
<211> 4313 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 3 

atgggtgcga gagcgtcggt attaagcggg ggagaattag ataaatggga aaaaattcgg 60 
ttaaggccag ggggaaagaa aaaatataag ttaaaacata tagtatgggc aagcagggag 120
ctagaacgat tcgcagtcaa tcctggcctg ttagaaacat cagaaggctg cagacaaata 180
ttgggacagc tacagccatc ccttcagaca ggatcagaag aacttagatc attatataat 240
acagtagcaa ccctctattg tgtacatcaa aggatagatg taaaagacac caaggaagct 300
ttagagaaga tagaggaaga gcaaaacaaa agtaagaaaa aggcacagca agcagcagct 360
gcagctggca caggaaacag cagccaggtc agccaaaatt accctatagt gcagaaccta 420
caggggcaaa tggtacatca ggccatatca cctagaactt taaatgcatg ggtaaaagta 480
gtagaagaaa aggctttcag cccagaagta atacccatgt tttcagcatt atcagaagga 540
gccaccccac aagatttaaa caccatgcta aacacagtgg ggggacatca agcagccatg 600
caaatgttaa aagagactat caatgaggaa gctgcagaat gggatagagt gcatccagtg 660
catgcagggc ctattgcacc aggccaaatg agagaaccaa ggggaagtga catagcagga 720
actactagta cccttcagga acaaatagga tggatgacaa ataatccacc tatcccagta 780
ggagaaatct ataaaagatg gataatcctg ggattaaata aaatagtaag aatgtatagc 840
cctaccagca ttctggacat aagacaagga ccaaaggaac cctttagaga ttatgtagac 900
cggttctata aaactctaag agccgaacaa gcttcacagg atgtaaaaaa ttggatgaca 960
gaaaccttgt tggtccaaaa tgcaaaccca gattgtaaga ctattttaaa agcattggga 1020
ccagcagcta cactagaaga aatgatgaca gcatgtcagg gagtgggggg acccggccat 1080
aaagcaagag ttttggctga agccatgagc caagtaacaa atccagctaa cataatgatg 1140
cagagaggca attttaggaa ccaaagaaag actgttaagt gtttcaattg tggcaaagaa 1200
gggcacatag ccaaaaattg cagggcccct aggaaaaagg gctgttggag atgtggaagg 1260
gaaggacacc aaatgaaaga ttgcactgag agacaggcta attttttagg gaagatctgg 1320
ccttcctaca agggaaggcc agggaatttt cttcagagca gaccagagcc aacagcccca 1380
ccagaagaga gcttcaggtt tggggaggag aaaacaactc cctctcagaa gcaggagccg 1440
atagacaagg aactgtatcc tttaacttcc ctcagatcac tctttggcaa cgacccctcg 1500
tcacaataag gatagggggg caactaaagg aagctctatt agatacagga gcagatgata 1560
cagtattaga agaaatgaat ttgccaggaa aatggaaacc aaaaatgata gggggaattg 1620
gaggttttat caaagtaaga cagtacgatc agatacctgt agaaatctgt ggacataaag 1680
ctataggtac agtattagta ggacctacac ctgtcaacat aattggaaga aatctgttga 1740
ctcagattgg ttgtacttta aatttcccca ttagtcctat tgaaactgta ccagtaaaat 1800
taaagccagg aatggatggc ccaaaagtta agcaatggcc attgacagaa gaaaaaataa 1860
aagcattagt agagatatgt acagaaatgg aaaaggaagg gaaaatttca aaaattgggc 1920
ctgaaaatcc atacaatact ccagtatttg ctataaagaa aaaagacagt actaaatgga 1980
gaaaactagt agatttcaga gaacttaata aaagaactca agacttctgg gaagttcagt 2040
taggaatacc acaccccgca gggttaaaaa agaaaaaatc agtaacagta ttggatgtgg 2100
gtgatgcata cttttcagtt cccttagata aagactttag aaagtatact gcatttacca 2160
tacctagtat aaacaatgag acaccaggga ttagatatca gtacaatgtg ctgccacagg 2220
gatggaaagg atcaccagca atattccaaa gtagcatgac aaaaatctta gagcctttta 2280
gaaaacagaa tccagacata gttatctatc aatacatgga tgatttgtat gtaggatctg 2340
acttagaaat agggcagcat agaacaaaaa tagaggaact gagacagcat ctgttgaggt 2400
ggggatttac cacaccagac aaaaaacatc agaaagaacc tccattcctt tggatgggtt 2460
atgaactcca tcctgataaa tggacagtac agcctataat gctgccagaa aaagacagct 2520
ggactgtcaa tgacatacag aagttagtgg gaaaattgaa ttgggcaagt cagatttatg 2580
cagggattaa agtaaagcag ttatgtaaac tccttagagg aaccaaagca ctaacagaag 2640
taataccact aacagaagaa gcagagctag aactggcaga aaacagggag attctaaaag 2700
aaccagtaca tgaagtatat tatgacccat caaaagactt agtagcagaa atacagaagc 2760
aggggcaagg ccaatggaca tatcaaattt atcaagagcc atttaaaaat ctgaaaacag 2820
gaaagtatgc aaggatgagg ggtgcccaca ctaatgatgt aaaacagtta acagaggcag 2880
tgcaaaaagt atccacagaa agcatagtaa tatggggaaa gattcctaaa tttaaactac 2940
ccatacaaaa ggaaacatgg gaagcatggt ggatggagta ttggcaagct acctggattc 3000
ctgagtggga gtttgtcaat acccctccct tagtgaaatt atggtaccag ttagagaaag 3060
aacccatagt aggagcagaa actttctatg tagatggggc agctaatagg gagactaaat 3120
taggaaaagc aggatatgtt actgacagag gaagacaaaa agttgtctcc atagctgaca 3180
caacaaatca gaagactgaa ttacaagcaa ttcatctagc tttgcaggat tcgggattag 3240
aagtaaacat agtaacagac tcacaatatg cattaggaat cattcaagca caaccagata 3300
agagtgaatc agagttagtc agtcaaataa tagagcagtt aataaaaaag gaaaaggtct 3360
acctggcatg ggtaccagca cacaaaggaa ttggaggaaa tgaacaagta gataaattag 3420
tcagtgctgg aatcaggaaa gtactatttt tgaatggaat agataaggcc caagaagaac 3480
atgagaaata tcacagtaat tggagagcaa tggctagtga ttttaacctg ccacctgtag 3540
tagcaaaaga aatagtagcc agctgtgata aatgtcagct aaaaggagaa gccatgcatg 3600
gacaagtaga ctgtagtcca ggaatatggc aactagattg tacacatcta gaaggaaaaa 3660
ttatcctggt agcagttcat gtagccagtg gatatataga agcagaagtt attccagcag 3720
agacagggca ggaaacagca tattttctct taaaattagc aggaagatgg ccagtaaaaa 3780
caatacatac agacaatggc agcaatttca ccagtactac ggttaaggcc gcctgttggt 3840
gggcagggat caagcaggaa tttggcattc cctacaatcc ccaaagtcaa ggagtagtag 3900
aatctatgaa taatgaatta aagaaaatta taggacaggt aagagatcag gctgaacacc 3960
ttaagacagc agtacaaatg gcagtattca tccacaattt taaaagaaaa ggggggattg 4020
ggggatacag tgcaggggaa agaatagtag acataatagc aacagacata caaactaaag 4080
aactacaaaa gcaaattaca aaaattcaaa attttcgggt ttattacagg gacaacaaag 4140
atcccctttg gaaaggacca gcaaagcttc tctggaaagg tgaaggggca gtagtaatac 4200
aagataatag tgacataaaa gtagtgccaa gaagaaaagc aaaaatcatt agggattatg 4260
gaaaacagat ggcaggtgat gattgtgtgg caagtagaca ggatgaggat tag 4313

<210> 4 
<211> 1515 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
HIV-Gag 

<400> 4 

gccaccatgg gcgcccgcgc cagcgtgctg agcggcggcg agctggacaa gtgggagaag 60 
atccgcctgc gccccggcgg caagaagaag tacaagctga agcacatcgt gtgggccagc 120
cgcgagctgg agcgcttcgc cgtgaacccc ggcctgctgg agaccagcga gggctgccgc 180
cagatcctgg gccagctgca gcccagcctg cagaccggca gcgaggagct gcgcagcctg 240
tacaacaccg tggccaccct gtactgcgtg caccagcgca tcgacgtcaa ggacaccaag 300
gaggccctgg agaagatcga ggaggagcag aacaagtcca agaagaaggc ccagcaggcc 360
gccgccgccg ccggcaccgg caacagcagc caggtgagcc agaactaccc catcgtgcag 420
aacctgcagg gccagatggt gcaccaggcc atcagccccc gcaccctgaa cgcctgggtg 480 
aaggtggtgg aggagaaggc cttcagcccc gaggtgatcc ccatgttcag cgccctgagc 540 
gagggcgcca ccccccagga cctgaacacg atgttgaaca ccgtgggcgg ccaccaggcc 600 
gccatgcaga tgctgaagga gaccatcaac gaggaggccg ccgagtggga ccgcgtgcac 660 
cccgtgcacg ccggccccat cgcccccggc cagatgcgcg agccccgcgg cagcgacatc 720
gccggcacca ccagcaccct gcaggagcag atcggctgga tgaccaacaa cccccccatc 780
cccgtgggcg agatctacaa gcggtggatc atcctgggcc tgaacaagat cgtgcggatg 840
tacagcccca ccagcatcct ggacatccgc cagggcccca aggagccctt ccgcgactac 900
gtggaccgct tctacaagac cctgcgcgct gagcaggcca gccaggacgt gaagaactgg 960
atgaccgaga ccctgctggt gcagaacgcc aaccccgact gcaagaccat cctgaaggct 1020
ctcggccccg cggccaccct ggaggagatg atgaccgcct gccagggcgt gggcggcccc 1080
ggccacaagg cccgcgtgct ggccgaggcg atgagccagg tgacgaaccc ggcgaccatc 1140
atgatgcagc gcggcaactt ccgcaaccag cggaagaccg tcaagtgctt caactgcggc 1200
aaggagggcc acaccgccag gaactgccgc gccccccgca agaagggctg ctggcgctgc 1260
ggccgcgagg gccaccagat gaaggactgc accgagcgcc aggccaactt cctgggcaag 1320
atctggccca gctacaaggg ccgccccggc aacttcctgc agagccgccc cgagcccacc 1380
gccccccccg aggagagctt ccgcttcggc gaggagaaga ccacccccag ccagaagcag 1440
gagcccatcg acaaggagct gtaccccctg accagcctgc gcagcctgtt cggcaacgac 1500
cccagcagcc agtaa 1515

<210> 5
<211> 1853
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: synthetic 
HIV-Gag-protease 

<400> 5 

gccaccatgg gcgcccgcgc cagcgtgctg agcggcggcg agctggacaa gtgggagaag 60 
atccgcctgc gccccggcgg caagaagaag tacaagctga agcacatcgt gtgggccagc 120
cgcgagctgg agcgcttcgc cgtgaacccc ggcctgctgg agaccagcga gggctgccgc 180
cagatcctgg gccagctgca gcccagcctg cagaccggca gcgaggagct gcgcagcctg 240
tacaacaccg tggccaccct gtactgcgtg caccagcgca tcgacgtcaa ggacaccaag 300
gaggccctgg agaagatcga ggaggagcag aacaagtcca agaagaaggc ccagcaggcc 360
gccgccgccg ccggcaccgg caacagcagc caggtgagcc agaactaccc catcgtgcag 420
aacctgcagg gccagatggt gcaccaggcc atcagccccc gcaccctgaa cgcctgggtg 480
aaggtggtgg aggagaaggc cttcagcccc gaggtgatcc ccatgttcag cgccctgagc 540 
gagggcgcca ccccccagga cctgaacacg atgttgaaca ccgtgggcgg ccaccaggcc 600 
gccatgcaga tgctgaagga gaccatcaac gaggaggccg ccgagtggga ccgcgtgcac 660 
cccgtgcacg ccggccccat cgcccccggc cagatgcgcg agccccgcgg cagcgacatc 720
gccggcacca ccagcaccct gcaggagcag atcggctgga tgaccaacaa cccccccatc 780 
cccgtgggcg agatctacaa gcggtggatc atcctgggcc tgaacaagat cgtgcggatg 840 
tacagcccca ccagcatcct ggacatccgc cagggcccca aggagccctt ccgcgactac 900 
gtggaccgct tctacaagac cctgcgcgct gagcaggcca gccaggacgt gaagaactgg 960 
atgaccgaga ccctgctggt gcagaacgcc aaccccgact gcaagaccat cctgaaggct 1020
ctcggccccg cggccaccct ggaggagatg atgaccgcct gccagggcgt gggcggcccc 1080 
ggccacaagg cccgcgtgct ggccgaggcg atgagccagg tgacgaaccc ggcgaccatc 1140 
atgatgcagc gcggcaactt ccgcaaccag cggaagaccg tcaagtgctt caactgcggc 1200
aaggagggcc acaccgccag gaactgccgc gccccccgca agaagggctg ctggcgctgc 1260 
ggccgcgaag gacaccaaat gaaagattgc actgagagac aggctaattt tttagggaag 1320 
atctggcctt cctacaaggg aaggccaggg aattttcttc agagcagacc agagccaaca 1380
gccccaccag aagagagctt caggtttggg gaggagaaaa caactccctc tcagaagcag 1440 
gagccgatag acaaggaact gtatccttta acttccctca gatcactctt tggcaacgac 1500 
ccctcgtcac agtaaggatc ggcggccagc tcaaggaggc gctgctcgac accggcgccg 1560 
acgacaccgt gctggaggag atgaacctgc ccggcaagtg gaagcccaag atgatcggcg 162 0 
ggatcggggg cttcatcaag gtgcggcagt acgaccagat ccccgtggag atctgcggcc 1680 
acaaggccat cggcaccgtg ctggtgggcc ccacccccgt gaacatcatc ggccgcaacc 1740 
tgctgaccca gatcggctgc accctgaact tccccatcag ccccatcgag acggtgcccg 1800 
tgaagctgaa gccggggatg gacggcccca aggtcaagca gtggcccctg taa 1853 

<210> 6 
<211> 4319 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
HIV-Gag-polymerase

<400> 6 

gccaccatgg gcgcccgcgc cagcgtgctg agcggcggcg agctggacaa gtgggagaag 60 
atccgcctgc gccccggcgg caagaagaag tacaagctga agcacatcgt gtgggccagc 120
cgcgagctgg agcgcttcgc cgtgaacccc ggcctgctgg agaccagcga gggctgccgc 180
cagatcctgg gccagctgca gcccagcctg cagaccggca gcgaggagct gcgcagcctg 240
tacaacaccg tggccaccct gtactgcgtg caccagcgca tcgacgtcaa ggacaccaag 300
gaggccctgg agaagatcga ggaggagcag aacaagtcca agaagaaggc ccagcaggcc 360
gccgccgccg ccggcaccgg caacagcagc caggtgagcc agaactaccc catcgtgcag 420
aacctgcagg gccagatggt gcaccaggcc atcagccccc gcaccctgaa cgcctgggtg 480
aaggtggtgg aggagaaggc cttcagcccc gaggtgatcc ccatgttcag cgccctgagc 540
gagggcgcca ccccccagga cctgaacacg atgttgaaca ccgtgggcgg ccaccaggcc 600
gccatgcaga tgctgaagga gaccatcaac gaggaggccg ccgagtggga ccgcgtgcac 660
cccgtgcacg ccggccccat cgcccccggc cagatgcgcg agccccgcgg cagcgacatc 720
gccggcacca ccagcaccct gcaggagcag atcggctgga tgaccaacaa cccccccatc 780
cccgtgggcg agatctacaa gcggtggatc atcctgggcc tgaacaagat cgtgcggatg 840
tacagcccca ccagcatcct ggacatccgc cagggcccca aggagccctt ccgcgactac 900
gtggaccgct tctacaagac cctgcgcgct gagcaggcca gccaggacgt gaagaactgg 960
atgaccgaga ccctgctggt gcagaacgcc aaccccgact gcaagaccat cctgaaggct 1020
ctcggccccg cggccaccct ggaggagatg atgaccgcct gccagggcgt gggcggcccc 1080
ggccacaagg cccgcgtgct ggccgaggcg atgagccagg tgacgaaccc ggcgaccatc 1140
atgatgcagc gcggcaactt ccgcaaccag cggaagaccg tcaagtgctt caactgcggc 1200
aaggagggcc acaccgccag gaactgccgc gccccccgca agaagggctg ctggcgctgc 1260
ggccgcgaag gacaccaaat gaaagattgc actgagagac aggctaattt tttagggaag 1320
atctggcctt cctacaaggg aaggccaggg aattttcttc agagcagacc agagccaaca 1380
gccccaccag aagagagctt caggtttggg gaggagaaaa caactccctc tcagaagcag 1440
gagccgatag acaaggaact gtatccttta acttccctca gatcactctt tggcaacgac 1500
ccctcgtcac agtaaggatc ggcggccagc tcaaggaggc gctgctcgac accggcgccg 1560
acgacaccgt gctggaggag atgaacctgc ccggcaagtg gaagcccaag atgatcggcg 1620
ggatcggggg cttcatcaag gtgcggcagt acgaccagat ccccgtggag atctgcggcc 1680
acaaggccat cggcaccgtg ctggtgggcc ccacccccgt gaacatcatc ggccgcaacc 1740
tgctgaccca gatcggctgc accctgaact tccccatcag ccccatcgag acggtgcccg 1800
tgaagctgaa gccggggatg gacggcccca aggtcaagca gtggcccctg accgaggaga 1860
agatcaaggc cctggtggag atctgcaccg agatggagaa ggagggcaag atcagcaaga 1920
tcggccccga gaacccctac aacacccccg tgttcgccat caagaagaag gacagcacca 1980
agtggcgcaa gctggtggac ttccgcgagc tgaacaagcg cacccaggac ttctgggagg 2040
tgcagctggg catcccccac cccgccggcc tgaagaagaa gaagagcgtg accgtgctgg 2100
acgtgggcga cgcctacttc agcgtgcccc tggacaagga cttccgcaag tacaccgcct 2160
tcaccatccc cagcatcaac aacgagaccc ccggcatccg ctaccagtac aacgtgctgc 2220
cccagggctg gaagggcagc cccgccatct tccagagcag catgaccaag atcctggagc 2280
ccttccgcaa gcagaacccc gacatcgtga tctaccagta catggacgac ctgtacgtgg 2340
gcagcgacct ggagatcggc cagcaccgca ccaagatcga ggagctgcgc cagcacctgc 2400
tgcgctgggg cttcaccacc cccgacaaga agcaccagaa ggagcccccc ttcctgtgga 2460
tgggctacga gctgcacccc gacaagtgga ccgtgcagcc catcatgctg cccgagaagg 2520
acagctggac cgtgaacgac atccagaagc tggtgggcaa gctgaactgg gccagccaga 2580
tctacgccgg catcaaggtg aagcagctgt gcaagctgct gcgcggcacc aaggccctga 2640
ccgaggtgat ccccctgacc gaggaggccg agctggagct ggccgagaac cgcgagatcc 2700
tgaaggagcc cgtgcacgag gtgtactacg accccagcaa ggacctggtg gccgagatcc 2760
agaagcaggg ccagggccag tggacctacc agatctacca ggagcccttc aagaacctga 2820
agaccggcaa gtacgcccgc atgcgcggcg cccacaccaa cgacgtgaag cagctgaccg 2880
aggccgtgca gaaggtgagc accgagagca tcgtgatctg gggcaagatc cccaagttca 2940
agctgcccat ccagaaggag acctgggagg cctggtggat ggagtactgg caggccacct 3000
ggatccccga gtgggagttc gtgaacaccc cccccctggt gaagctgtgg taccagctgg 3060
agaaggagcc catcgtgggc gccgagacct tctacgtgga cggcgccgcc aaccgcgaga 3120
ccaagctggg caaggccggc tacgtgaccg accgcggccg ccagaaggtg gtgagcatcg 3180
ccgacaccac caaccagaag accgagctgc aggccatcca cctggccctg caggacagcg 3240
gcctggaggt gaacatcgtg accgacagcc agtacgccct gggcatcatc caggcccagc 3300
ccgacaagag cgagagcgag ctggtgagcc agatcatcga gcagctgatc aagaaggaga 3360
aggtgtacct ggcctgggtg cccgcccaca agggcatcgg cggcaacgag caggtggaca 3420
agctggtgag cgccggcatc cgcaaggtgc tgttcctgaa cggcatcgac aaggcccagg 3480
aggagcacga gaagtaccac agcaactggc gcgccatggc cagcgacttc aacctgcccc 3540
ccgtggtggc caaggagatc gtggccagct gcgacaagtg ccagctgaag ggcgaggcca 3600
tgcacggcca ggtggactgc agccccggca tctggcagct ggactgcacc cacctggagg 3660
gcaagatcat cctggtggcc gtgcacgtgg ccagcggcta catcgaggcc gaggtgatcc 3720
ccgccgagac cggccaggag accgcctact tcctgctgaa gctggccggc cgctggcccg 3780
tgaagaccat ccacaccgac aacggcagca acttcaccag caccaccgtg aaggccgcct 3840
gctggtgggc cggcatcaag caggagttcg gcatccccta caacccccag agccagggcg 3900
tggtggagag catgaacaac gagctgaaga agatcatcgg ccaggtgcgc gaccaggccg 3960
agcacctgaa gaccgccgtg cagatggccg tgttcatcca caacttcaag cgcaagggcg 4020
gcatcggcgg ctacagcgcc ggcgagcgca tcgtggacat catcgccacc gacatccaga 4080
ccaaggagct gcagaagcag atcaccaaga tccagaactt ccgcgtgtac taccgcgaca 4140
acaaggaccc cctgtggaag ggccccgcca agctgctgtg gaagggcgag ggcgccgtgg 4200
tgatccagga caacagcgac atcaaggtgg tgccccgccg caaggccaag atcatccgcg 4260
actacggcaa gcagatggcc ggcgacgact gcgtggccag ccgccaggac gaggactag 4319

<210> 7
<211> 2031
<212> DNA

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: synthetic 
HIV-Gag/HCV-core fusion polypeptide 

<400> 7 

gccaccatgg gcgcccgcgc cagcgtgctg agcggcggcg agctggacaa gtgggagaag 60 
atccgcctgc gccccggcgg caagaagaag tacaagctga agcacatcgt gtgggccagc 120
cgcgagctgg agcgcttcgc cgtgaacccc ggcctgctgg agaccagcga gggctgccgc 180
cagatcctgg gccagctgca gcccagcctg cagaccggca gcgaggagct gcgcagcctg 240
tacaacaccg tggccaccct gtactgcgtg caccagcgca tcgacgtcaa ggacaccaag 300
gaggccctgg agaagatcga ggaggagcag aacaagtcca agaagaaggc ccagcaggcc 360
gccgccgccg ccggcaccgg caacagcagc caggtgagcc agaactaccc catcgtgcag 420
aacctgcagg gccagatggt gcaccaggcc atcagccccc gcaccctgaa cgcctgggtg 480 
aaggtggtgg aggagaaggc cttcagcccc gaggtgatcc ccatgttcag cgccctgagc 540 
gagggcgcca ccccccagga cctgaacacg atgttgaaca ccgtgggcgg ccaccaggcc 600 
gccatgcaga tgctgaagga gaccatcaac gaggaggccg ccgagtggga ccgcgtgcac 660 
cccgtgcacg ccggccccat cgcccccggc cagatgcgcg agccccgcgg cagcgacatc 720 
gccggcacca ccagcaccct gcaggagcag atcggctgga tgaccaacaa cccccccatc 780
cccgtgggcg agatctacaa gcggtggatc atcctgggcc tgaacaagat cgtgcggatg 840 
tacagcccca ccagcatcct ggacatccgc cagggcccca aggagccctt ccgcgactac 900 
gtggaccgct tctacaagac cctgcgcgct gagcaggcca gccaggacgt gaagaactgg 960 
atgaccgaga ccctgctggt gcagaacgcc aaccccgact gcaagaccat cctgaaggct 1020
ctcggccccg cggccaccct ggaggagatg atgaccgcct gccagggcgt gggcggcccc 1080 
ggccacaagg cccgcgtgct ggccgaggcg atgagccagg tgacgaaccc ggcgaccatc 1140 
atgatgcagc gcggcaactt ccgcaaccag cggaagaccg tcaagtgctt caactgcggc 1200 
aaggagggcc acaccgccag gaactgccgc gccccccgca agaagggctg ctggcgctgc 1260 
ggccgcgagg gccaccagat gaaggactgc accgagcgcc aggccaactt cctgggcaag 1320
atctggccca gctacaaggg ccgccccggc aacttcctgc agagccgccc cgagcccacc 1380
gccccccccg aggagagctt ccgcttcggc gaggagaaga ccacccccag ccagaagcag 1440
gagcccatcg acaaggagct gtaccccctg accagcctgc gcagcctgtt cggcaacgac 1500 
cccagcagcc agtcgacgaa tcctaaacct caaagaaaaa acaaacgtaa caccaaccgt 1560 
cgcccacagg acgtcaagtt cccgggtggc ggtcagatcg ttggtggagt ttacttgttg 1620
ccgcgcaggg gccctagatt gggtgtgcgc gcgacgagaa agacttccga gcggtcgcaa 1680
cctcgaggta gacgtcagcc tatccccaag gctcgtcggc ccgagggcag gacctgggct 1740
cagcccgggt acccttggcc cctctatggc aatgagggct gcgggtgggc gggatggctc 1800 
ctgtctcccc gtggctctcg gcctagctgg ggccccacag acccccggcg taggtcgcgc 1860 
aatttgggta aggtcatcga tacccttacg tgcggcttcg ccgacctcat ggggtacata 1920
ccgctcgtcg gcgcccctct tggaggcgct gccagggccc tggcgcatgg cgtccgggtt 1980
ctggaagacg gcgtgaacta tgcaacaggg aaccttcctg gttgctctta g 2031 

<210> 8 
<211> 2025 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 

HIV-Gag/HCV-Core fusion polypeptide

<400> 8 

atgggtgcga gagcgtcggt attaagcggg ggagaattag ataaatggga aaaaattcgg 60 
ttaaggccag ggggaaagaa aaaatataag ttaaaacata tagtatgggc aagcagggag 120 
ctagaacgat tcgcagtcaa tcctggcctg ttagaaacat cagaaggctg cagacaaata 180
ttgggacagc tacagccatc ccttcagaca ggatcagaag aacttagatc attatataat 240
acagtagcaa ccctctattg tgtacatcaa aggatagatg taaaagacac caaggaagct 300 
ttagagaaga tagaggaaga gcaaaacaaa agtaagaaaa aggcacagca agcagcagct 360 
gcagctggca caggaaacag cagccaggtc agccaaaatt accctatagt gcagaaccta 420
caggggcaaa tggtacatca ggccatatca cctagaactt taaatgcatg ggtaaaagta 480
gtagaagaaa aggctttcag cccagaagta atacccatgt tttcagcatt atcagaagga 540 
gccaccccac aagatttaaa caccatgcta aacacagtgg ggggacatca agcagccatg 600 
caaatgttaa aagagactat caatgaggaa gctgcagaat gggatagagt gcatccagtg 660 
catgcagggc ctattgcacc aggccaaatg agagaaccaa ggggaagtga catagcagga 720
actactagta cccttcagga acaaatagga tggatgacaa ataatccacc tatcccagta 780
ggagaaatct ataaaagatg gataatcctg ggattaaata aaatagtaag aatgtatagc 840
cctaccagca ttctggacat aagacaagga ccaaaggaac cctttagaga ttatgtagac 900 
cggttctata aaactctaag agccgaacaa gcttcacagg atgtaaaaaa ttggatgaca 960 
gaaaccttgt tggtccaaaa tgcaaaccca gattgtaaga ctattttaaa agcattggga 1020
ccagcagcta cactagaaga aatgatgaca gcatgtcagg gagtgggggg acccggccat 1080
aaagcaagag ttttggctga agccatgagc caagtaacaa atccagctaa cataatgatg 1140
cagagaggca attttaggaa ccaaagaaag actgttaagt gtttcaattg tggcaaagaa 1200
gggcacatag ccaaaaattg cagggcccct aggaaaaagg gctgttggag atgtggaagg 1260
gaaggacacc aaatgaaaga ttgcactgag agacaggcta attttttagg gaagatctgg 1320
ccttcctaca agggaaggcc agggaatttt cttcagagca gaccagagcc aacagcccca 1380
ccagaagaga gcttcaggtt tggggaggag aaaacaactc cctctcagaa gcaggagccg 1440 
atagacaagg aactgtatcc tttaacttcc ctcagatcac tctttggcaa cgacccctcg 1500 
tcacagtcga cgaatcctaa acctcaaaga aaaaacaaac gtaacaccaa ccgtcgccca 1560 
caggacgtca agttcccggg tggcggtcag atcgttggtg gagtttactt gttgccgcgc 1620
aggggcccta gattgggtgt gcgcgcgacg agaaagactt ccgagcggtc gcaacctcga 1680
ggtagacgtc agcctatccc caaggctcgt cggcccgagg gcaggacctg ggctcagccc 1740
gggtaccctt ggcccctcta tggcaatgag ggctgcgggt gggcgggatg gctcctgtct 1800
ccccgtggct ctcggcctag ctggggcccc acagaccccc ggcgtaggtc gcgcaatttg 1860
ggtaaggtca tcgataccct tacgtgcggc ttcgccgacc tcatggggta cataccgctc 1920
gtcggcgccc ctcttggagg cgctgccagg gccctggcgc atggcgtccg ggttctggaa 1980 
gacggcgtga actatgcaac agggaacctt cctggttgct cttag 2025

<210> 9 

<211> 1268 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: synthetic Gag
common region 

<400> 9 

gccaccatgg gcgcccgcgc cagcgtgctg agcggcggcg agctggacaa gtgggagaag 60 
atccgcctgc gccccggcgg caagaagaag tacaagctga agcacatcgt gtgggccagc 120
cgcgagctgg agcgcttcgc cgtgaacccc ggcctgctgg agaccagcga gggctgccgc 180 
cagatcctgg gccagctgca gcccagcctg cagaccggca gcgaggagct gcgcagcctg 240 
tacaacaccg tggccaccct gtactgcgtg caccagcgca tcgacgtcaa ggacaccaag 300
gaggccctgg agaagatcga ggaggagcag aacaagtcca agaagaaggc ccagcaggcc 360
gccgccgccg ccggcaccgg caacagcagc caggtgagcc agaactaccc catcgtgcag 420
aacctgcagg gccagatggt gcaccaggcc atcagccccc gcaccctgaa cgcctgggtg 480
aaggtggtgg aggagaaggc cttcagcccc gaggtgatcc ccatgttcag cgccctgagc 540
gagggcgcca ccccccagga cctgaacacg atgttgaaca ccgtgggcgg ccaccaggcc 600 
gccatgcaga tgctgaagga gaccatcaac gaggaggccg ccgagtggga ccgcgtgcac 660 
cccgtgcacg ccggccccat cgcccccggc cagatgcgcg agccccgcgg cagcgacatc 720
gccggcacca ccagcaccct gcaggagcag atcggctgga tgaccaacaa cccccccatc 780
cccgtgggcg agatctacaa gcggtggatc atcctgggcc tgaacaagat cgtgcggatg 840
tacagcccca ccagcatcct ggacatccgc cagggcccca aggagccctt ccgcgactac 900
gtggaccgct tctacaagac cctgcgcgct gagcaggcca gccaggacgt gaagaactgg 960 
atgaccgaga ccctgctggt gcagaacgcc aaccccgact gcaagaccat cctgaaggct 1020 
ctcggccccg cggccaccct ggaggagatg atgaccgcct gccagggcgt gggcggcccc 1080 
ggccacaagg cccgcgtgct ggccgaggcg atgagccagg tgacgaaccc ggcgaccatc 1140 
atgatgcagc gcggcaactt ccgcaaccag cggaagaccg tcaagtgctt caactgcggc 1200
aaggagggcc acaccgccag gaactgccgc gccccccgca agaagggctg ctggcgctgc 1260 
ggccgcga 1268 

<210> 10 

<211> 20 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: HIV-Gag 
peptide p7G 

<400> 10 

Gly Gly His Gln Ala Ala Met Gln Met Leu Lys Glu Thr Ile Asn Glu
1               5                   10                  15

Glu Ala Ala Glu
            20



<210> 11 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer GAG5



<400> 11 

aagaattcca tgggtgcgag agcgtcggta 30



<210> 12 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
p55-SAL3 

<400> 12 

attcgtcgac tgtgacgagg ggtcgttgcc 30

<210> 13 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
CORESAL5 

<400> 13 

atttgtcgac gaatcctaaa cctcaaagaa aaac 

<210> 14 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 173CORE 
<400> 14 

tattggatcc taagagcaac caggaaggtt c 

<210> 15 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer MS65 
<400> 15 

cgaccatcat ggatgcagcg c 

<210> 16 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer MS66 
<400> 16 

aggattcgtc gagtcgctgc tggggtcgtt 30 

<210> 17 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer XPANXNF 
<400> 17 

gcacgtgggc ccggcgcctc tagagc 26 

<210> 18 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer XPANXNR 
<400> 18 

gctctagagg cgccgggccc acgtgc 26 

<210> 19 
<211> 20 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: HIV p55 Gag 
Major Homology Region 

<400> 19 

Asp Ile Arg Gln Gly Pro Lys Glu Pro Phe Arg Asp Tyr Val Asp Arg 
1 5 10 15 

Phe Tyr Lys Thr 
20 



<210> 20 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic p55 
Gag Major Homology Region 

<400> 20 

gacatccgcc agggccccaa ggagcccttc cgcgactacg tggaccgctt ctacaagacc 60 
<210> 21 
<211> 15 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 21 

Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg 
1 5 10 15 



<210> 22 
<211> 5 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 22 

Lys Ala Lys Arg Arg 
1 5 



<210> 23 
<211> 4 
<212> PRT 

<213> Human immunodeficiency virus 

<400> 23 
Arg Glu Lys Arg 
1 



<210> 24 
<211> 15 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: aa of 
mut7.SF162 cleavage site 

<400> 24 

Ala Pro Thr Lys Ala Ile Ser Ser Val Val Gln Ser Glu Lys Ser 
1 5 10 15 



<210> 25 
<211> 15 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: aa of 
mut8.SF162 cleavage site 

<400> 25 
Ala Pro Thr Ile Ala Ile Ser Ser Val Val Gln Ser Glu Lys Ser 
1 5 10 15 



<210> 26 
<211> 15 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: aa of 
mut.SF162 cleavage site 

<400> 26 

Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys Ser 
1 5 10 15 



<210> 27 
<211> 15 
<212> PRT 

<213> Human immunodeficiency virus 
<220> 

<223> Description of Artificial Sequence: aa of native 
cleavage site in US4 

<400> 27 

Ala Pro Thr Gln Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg 
1 5 10 15 



<210> 28 
<211> 5 
<212> PRT 

<213> Human immunodeficiency virus 
<220> 

<223> Description of Artificial Sequence: aa of second 
cleavage site in US4 

<400> 28 

Gln Ala Lys Arg Arg 
1 5 



<210> 29 
<211> 15 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: aa of mut.US4 
cleavage site 
<400> 29 

Ala Pro Thr Gln Ala Lys Arg Arg Val Val Gln Arg Glu Lys Ser 
1 5 10 15 



<210> 30 
<211> 1419 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 30 

gtagaaaaat tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60 
actctatttt gtgcatcaga tgctaaagcc tatgacacag aggtacataa tgtctgggcc 120 
acacatgcct gtgtacccac agaccctaac ccacaagaaa tagtattgga aaatgtgaca 180 
gaaaatttta acatgtggaa aaataacatg gtagaacaga tgcatgagga tataatcagt 240 
ttatgggatc aaagtctaaa gccatgtgta aagttaaccc cactctgtgt tactctacat 300 
tgcactaatt tgaagaatgc tactaatacc aagagtagta attggaaaga gatggacaga 360 
ggagaaataa aaaattgctc tttcaaggtc accacaagca taagaaataa gatgcagaaa 420 
gaatatgcac ttttttataa acttgatgta gtaccaatag ataatgataa tacaagctat 480 
aaattgataa attgtaacac ctcagtcatt acacaggcct gtccaaaggt atcctttgaa 540 
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taatgataag 600 
aagttcaatg gatcaggacc atgtacaaat gtcagcacag tacaatgtac acatggaatt 660 
aggccagtag tgtcaactca attgctgtta aatggcagtc tagcagaaga aggggtagta 720 
attagatctg aaaatttcac agacaatgct aaaactataa tagtacagct gaaggaatct 780 
gtagaaatta attgtacaag acctaacaat aatacaagaa aaagtataac tataggaccg 840 
gggagagcat tttatgcaac aggagacata ataggagata taagacaagc acattgtaac 900 
attagtggag aaaaatggaa taacacttta aaacagatag ttacaaaatt acaagcacaa 960 
tttgggaata aaacaatagt ctttaagcaa tcctcaggag gggacccaga aattgtaatg 1020 
cacagtttta attgtggagg ggaatttttc tactgtaatt caacacagct ttttaatagt 1080 
acttggaata atactatagg gccaaataac actaatggaa ctatcacact cccatgcaga 1140 
ataaaacaaa ttataaacag gtggcaggaa gtaggaaaag caatgtatgc ccctcccatc 1200 
agaggacaaa ttagatgctc atcaaatatt acaggactgc tattaacaag agatggtggt 1260 
aaagagatca gtaacaccac cgagatcttc agacctggag gtggagatat gagggacaat 1320 
tggagaagtg aattatataa atataaagta gtaaaaattg agccattagg agtagcaccc 1380 
accaaggcaa agagaagagt ggtgcagaga gaaaaaaga 1419 

<210> 31 
<211> 1932 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 31 

gtagaaaaat tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60 
actctatttt gtgcatcaga tgctaaagcc tatgacacag aggtacataa tgtctgggcc 120 
acacatgcct gtgtacccac agaccctaac ccacaagaaa tagtattgga aaatgtgaca 180 
gaaaatttta acatgtggaa aaataacatg gtagaacaga tgcatgagga tataatcagt 240 
ttatgggatc aaagtctaaa gccatgtgta aagttaaccc cactctgtgt tactctacat 300 
tgcactaatt tgaagaatgc tactaatacc aagagtagta attggaaaga gatggacaga 360 
ggagaaataa aaaattgctc tttcaaggtc accacaagca taagaaataa gatgcagaaa 420 
gaatatgcac ttttttataa acttgatgta gtaccaatag ataatgataa tacaagctat 480 
aaattgataa attgtaacac ctcagtcatt acacaggcct gtccaaaggt atcctttgaa 540 
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taatgataag 600 
aagttcaatg gatcaggacc atgtacaaat gtcagcacag tacaatgtac acatggaatt 660 
aggccagtag tgtcaactca attgctgtta aatggcagtc tagcagaaga aggggtagta 720 
attagatctg aaaatttcac agacaatgct aaaactataa tagtacagct gaaggaatct 780 
gtagaaatta attgtacaag acctaacaat aatacaagaa aaagtataac tataggaccg 840 
gggagagcat tttatgcaac aggagacata ataggagata taagacaagc acattgtaac 900 
attagtggag aaaaatggaa taacacttta aaacagatag ttacaaaatt acaagcacaa 960 
tttgggaata aaacaatagt ctttaagcaa tcctcaggag gggacccaga aattgtaatg 1020 
cacagtttta attgtggagg ggaatttttc tactgtaatt caacacagct ttttaatagt 1080 
acttggaata atactatagg gccaaataac actaatggaa ctatcacact cccatgcaga 1140 
ataaaacaaa ttataaacag gtggcaggaa gtaggaaaag caatgtatgc ccctcccatc 1200 
agaggacaaa ttagatgctc atcaaatatt acaggactgc tattaacaag agatggtggt 1260 
aaagagatca gtaacaccac cgagatcttc agacctggag gtggagatat gagggacaat 1320 
tggagaagtg aattatataa atataaagta gtaaaaattg agccattagg agtagcaccc 1380 
accaaggcaa agagaagagt ggtgcagaga gaaaaaagag cagtgacgct aggagctatg 1440 
ttccttgggt tcttgggagc agcaggaagc actatgggcg cacggtcact gacgctgacg 1500 
gtacaggcca gacaattatt gtctggtata gtgcaacagc agaacaattt gctgagagct 1560 
attgaggcgc aacagcatct gttgcaactc acagtctggg gcatcaagca gctccaggca 1620 
agagtcctgg ctgtggaaag atacctaaag gatcaacagc tcctagggat ttggggttgc 1680 
tctggaaaac tcatttgcac cactgctgtg ccttggaatg ctagttggag taataaatct 1740 
ctggatcaga tttggaataa catgacctgg atggagtggg agagagaaat tgacaattac 1800 
acaaacttaa tatacacctt aattgaagaa tcgcagaacc aacaagaaaa gaatgaacaa 1860 
gaattattag aattggataa gtgggcaagt ttgtggaatt ggtttgacat atcaaaatgg 1920 
ctgtggtata ta 1932 



<210> 32 
<211> 2457 
<212> DNA 

<213> Human immunodeficiency virus 



<400> 32 

gtagaaaaat tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60 
actctatttt gtgcatcaga tgctaaagcc tatgacacag aggtacataa tgtctgggcc 120 
acacatgcct gtgtacccac agaccctaac ccacaagaaa tagtattgga aaatgtgaca 180 
gaaaatttta acatgtggaa aaataacatg gtagaacaga tgcatgagga tataatcagt 240 
ttatgggatc aaagtctaaa gccatgtgta aagttaaccc cactctgtgt tactctacat 300 
tgcactaatt tgaagaatgc tactaatacc aagagtagta attggaaaga gatggacaga 360 
ggagaaataa aaaattgctc tttcaaggtc accacaagca taagaaataa gatgcagaaa 420 
gaatatgcac ttttttataa acttgatgta gtaccaatag ataatgataa tacaagctat 480 
aaattgataa attgtaacac ctcagtcatt acacaggcct gtccaaaggt atcctttgaa 540 
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taatgataag 600 
aagttcaatg gatcaggacc atgtacaaat gtcagcacag tacaatgtac acatggaatt 660 
aggccagtag tgtcaactca attgctgtta aatggcagtc tagcagaaga aggggtagta 720 
attagatctg aaaatttcac agacaatgct aaaactataa tagtacagct gaaggaatct 780 
gtagaaatta attgtacaag acctaacaat aatacaagaa aaagtataac tataggaccg 840 
gggagagcat tttatgcaac aggagacata ataggagata taagacaagc acattgtaac 900 
attagtggag aaaaatggaa taacacttta aaacagatag ttacaaaatt acaagcacaa 960 
tttgggaata aaacaatagt ctttaagcaa tcctcaggag gggacccaga aattgtaatg 1020 
cacagtttta attgtggagg ggaatttttc tactgtaatt caacacagct ttttaatagt 1080 
acttggaata atactatagg gccaaataac actaatggaa ctatcacact cccatgcaga 1140 
ataaaacaaa ttataaacag gtggcaggaa gtaggaaaag caatgtatgc ccctcccatc 1200 
agaggacaaa ttagatgctc atcaaatatt acaggactgc tattaacaag agatggtggt 1260 
aaagagatca gtaacaccac cgagatcttc agacctggag gtggagatat gagggacaat 1320 
tggagaagtg aattatataa atataaagta gtaaaaattg agccattagg agtagcaccc 1380 
accaaggcaa agagaagagt ggtgcagaga gaaaaaagag cagtgacgct aggagctatg 1440 
ttccttgggt tcttgggagc agcaggaagc actatgggcg cacggtcact gacgctgacg 1500 
gtacaggcca gacaattatt gtctggtata gtgcaacagc agaacaattt gctgagagct 1560 
attgaggcgc aacagcatct gttgcaactc acagtctggg gcatcaagca gctccaggca 1620 
agagtcctgg ctgtggaaag atacctaaag gatcaacagc tcctagggat ttggggttgc 1680 
tctggaaaac tcatttgcac cactgctgtg ccttggaatg ctagttggag taataaatct 1740 
ctggatcaga tttggaataa catgacctgg atggagtggg agagagaaat tgacaattac 1800 
acaaacttaa tatacacctt aattgaagaa tcgcagaacc aacaagaaaa gaatgaacaa 1860 
gaattattag aattggataa gtgggcaagt ttgtggaatt ggtttgacat atcaaaatgg 1920 
ctgtggtata taaaaatatt cataatgata gtaggaggtt tagtaggttt aaggatagtt 1980 
tttactgtgc tttctatagt gaatagagtt aggcagggat actcaccatt atcatttcag 2040 
acccgcttcc cagccccaag gggacccgac aggcccgaag gaatcgaaga agaaggtgga 2100 
gagagagaca gagacagatc cagtccatta gtgcatggat tattagcact catctgggac 2160 
gatctacgga gcctgtgcct cttcagctac caccgcttga gagacttaat cttgattgca 2220 
gcgaggattg tggaacttct gggacgcagg gggtgggaag ccctcaagta ttgggggaat 2280 
ctcctgcagt attggattca ggaactaaag aatagtgctg ttagtttgtt tgatgccata 2340 
gctatagcag tagctgaggg gacagatagg attatagaag tagcacaaag aattggtaga 2400 
gcttttctcc acatacctag aagaataaga cagggctttg aaagggcttt gctataa 2457 

<210> 33 
<211> 1453 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: gp120.modSF162 
<400> 33 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260 
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320 
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380 
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440 
atcgagcccc tgg 1453 

<210> 34 
<211> 1387 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp120.modSF162.delV2 

<400> 34 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480 
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540 
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600 
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720 
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780 
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840 
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020 
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200 
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260 
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320 
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380 
cccacca 1387 

<210> 35 
<211> 1323 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp120.modSF162.delV1V2 

<400> 35 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggcgc cggcaactgc cagaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540 
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720 
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840 
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900 
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960 
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020 
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200 
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260 
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgctaactc 1320 
gag 1323 

<210> 36 
<211> 2025 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: gp140.modSF162 

<400> 36 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260 
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320 
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380 
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440 
atcgagcccc tgggcgtggc ccccaccaag gccaagcgcc gcgtggtgca gcgcgagaag 1500 
cgcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560 
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620 
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680 
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740 
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800 
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860 
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920 
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980 
aactggttcg acatcagcaa gtggctgtgg tacatctaac tcgag 2025 



<210> 37 
<211> 1944 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.modSF162.delV2 

<400> 37 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480 
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540 
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600 
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720 
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780 
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840 
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020 
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200 
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260 
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320 
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380 
cccaccaagg ccaagcgccg cgtggtgcag cgcgagaagc gcgccgtgac cctgggcgcc 1440 
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500 
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560 
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620 
gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680 
tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740 
agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800 
tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860 
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920 
tggctgtggt acatctaact cgag 1944 

<210> 38 
<211> 1944 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.modSF162.delV1/V2 

<400> 38 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480 
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540 
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600 
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720 
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780 
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840 
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020 
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200 
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260 
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320 
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380 
cccaccaagg ccaagcgccg cgtggtgcag cgcgagaagc gcgccgtgac cctgggcgcc 1440 
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500 
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560 
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620 
gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680 
tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740 
agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800 
tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860 
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920 
tggctgtggt acatctaact cgag 1944 

<210> 39 
<211> 2025 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.mut.modSF162 

<400> 39 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260 
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320 
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380 
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440 
atcgagcccc tgggcgtggc ccccaccaag gccaagcgcc gcgtggtgca gcgcgagaag 1500 
agcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560 
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620 
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680 
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740 
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800 
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860 
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920 
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980 
aactggttcg acatcagcaa gtggctgtgg tacatctaac tcgag 2025 

<210> 40 

<211> 1944 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.mut.modSF162.delV2 

<400> 40 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480 
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540 
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600 
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720 
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780 
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840 
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020 
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200 
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260 
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320 
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380 
cccaccaagg ccaagcgccg cgtggtgcag cgcgagaaga gcgccgtgac cctgggcgcc 1440 
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500 
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560 
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620 
gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680 
tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740 
agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800 
tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860 
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920 
tggctgtggt acatctaact cgag 1944 

<210> 41 
<211> 1836 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.mut.modSF162.delV1/V2 

<400> 41 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggcgc cggcaactgc cagaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540 
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720 
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840 
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900 
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960 
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020 
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200 
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260 
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gagcgccgtg 1320 
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380 
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440 
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500 
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620 
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680 
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740 
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800 
gacatcagca agtggctgtg gtacatctaa ctcgag 1836 



<210> 42 
<211> 2025 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.mut7.modSF162 

<400> 42 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 12 0 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 24 0 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440
atcgagcccc tgggcgtggc ccccaccaag gccatcagca gcgtggtgca gagcgagaag 1500 
agcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560 
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800 
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860 
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980
aactggttcg acatcagcaa gtggctgtgg tacatctaac tcgag 2025

<210> 43 
<211> 1944 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: 
gp140.mut7.modSF162.delV2

<400> 43 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540 
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380
cccaccaagg ccatcagcag cgtggtgcag agcgagaaga gcgccgtgac cctgggcgcc 1440
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560 
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620
gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680 
tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740 
agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800 
tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860 
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920
tggctgtggt acatctaact cgag 1944 

<210> 44 
<211> 1836 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.mut7.modSF162.delV1/V2

<400> 44 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgggcgc cggcaactgc cagaccagcg tgatcaccca ggcctgcccc 420
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540 
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020 
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260
ctgggcgtgg cccccaccaa ggccatcagc agcgtggtgc agagcgagaa gagcgccgtg 1320
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440 
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500 
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680 
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740 
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800
gacatcagca agtggctgtg gtacatctaa ctcgag 1836

<210> 45 
<211> 2025 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.mut8.modSF162

<400> 45 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380 
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440 
atcgagcccc tgggcgtggc ccccaccatc gccatcagca gcgtggtgca gagcgagaag 1500
agcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560 
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620 
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740 
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800 
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860 
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980
aactggttcg acatcagcaa gtggctgtgg tacatctaac tcgag 2025

<210> 46 
<211> 1944 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.mut8.modSF162.delV2

<400> 46 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480 
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540 
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600 
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020 
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380
cccaccatcg ccatcagcag cgtggtgcag agcgagaaga gcgccgtgac cctgggcgcc 1440
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500 
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560 
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620
gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680 
tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740 
agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800 
tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860 
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920
tggctgtggt acatctaact cgag 1944 



<210> 47 
<211> 1836 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp140.mut8.modSF162.delV1/V2



<400> 47 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgggcgc cggcaactgc cagaccagcg tgatcaccca ggcctgcccc 420
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260
ctgggcgtgg cccccaccat cgccatcagc agcgtggtgc agagcgagaa gagcgccgtg 1320
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800
gacatcagca agtggctgtg gtacatctaa ctcgag 1836



<210> 48 
<211> 2547 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: gp160.modSF162
<400> 48 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440
atcgagcccc tgggcgtggc ccccaccaag gccaagcgcc gcgtggtgca gcgcgagaag 1500 
cgcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560 
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800 
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860 
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980
aactggttcg acatcagcaa gtggctgtgg tacatcaaga tcttcatcat gatcgtgggc 2040
ggcctggtgg gcctgcgcat cgtgttcacc gtgctgagca tcgtgaaccg cgtgcgccag 2100 
ggctacagcc ccctgagctt ccagacccgc ttccccgccc cccgcggccc cgaccgcccc 2160 
gagggcatcg aggaggaggg cggcgagcgc gaccgcgacc gcagcagccc cctggtgcac 2220 
ggcctgctgg ccctgatctg ggacgacctg cgcagcctgt gcctgttcag ctaccaccgc 2280
ctgcgcgacc tgatcctgat cgccgcccgc atcgtggagc tgctgggccg ccgcggctgg 2340
gaggccctga agtactgggg caacctgctg cagtactgga tccaggagct gaagaacagc 2400 
gccgtgagcc tgttcgacgc catcgccatc gccgtggccg agggcaccga ccgcatcatc 2460 
gaggtggccc agcgcatcgg ccgcgccttc ctgcacatcc cccgccgcat ccgccagggc 2520
ttcgagcgcg ccctgctgta actcgag 2547 

<210> 49 
<211> 2466 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence:
gp160.modSF162.delV2

<400> 49

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380
cccaccaagg ccaagcgccg cgtggtgcag cgcgagaagc gcgccgtgac cctgggcgcc 1440
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620
gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680
tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740
agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800
tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920
tggctgtggt acatcaagat cttcatcatg atcgtgggcg gcctggtggg cctgcgcatc 1980
gtgttcaccg tgctgagcat cgtgaaccgc gtgcgccagg gctacagccc cctgagcttc 2040
cagacccgct tccccgcccc ccgcggcccc gaccgccccg agggcatcga ggaggagggc 2100
ggcgagcgcg accgcgaccg cagcagcccc ctggtgcacg gcctgctggc cctgatctgg 2160
gacgacctgc gcagcctgtg cctgttcagc taccaccgcc tgcgcgacct gatcctgatc 2220
gccgcccgca tcgtggagct gctgggccgc cgcggctggg aggccctgaa gtactggggc 2280
aacctgctgc agtactggat ccaggagctg aagaacagcg ccgtgagcct gttcgacgcc 2340
atcgccatcg ccgtggccga gggcaccgac cgcatcatcg aggtggccca gcgcatcggc 2400
cgcgccttcc tgcacatccc ccgccgcatc cgccagggct tcgagcgcgc cctgctgtaa 2460
ctcgag 2466

<210> 50
<211> 2358
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
gp160.modSF162.delV1/V2

<400> 50

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgggcgc cggcaactgc cagaccagcg tgatcaccca ggcctgcccc 420
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540 
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840 
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900 
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960 
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1320
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440 
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500 
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680 
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740 
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1860
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1920
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1980
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 2040
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2100 
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2160 
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2220
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2280 
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2340 
gccctgctgt aactcgag 2358 

<210> 51 
<211> 1494 
<212> DNA 

<213> Human immunodeficiency virus 



<400> 51 

acaacagtct tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60
actctgtttt gtgcatcaga tgctaaagca tacaaagcag aggcacataa cgtctgggct 120
acacatgcct gtgtacccac agaccccaac ccacaggaag taaatttaac aaatgtgaca 180
gaaaatttta acatgtggaa aaataacatg gtggaacaga tgcatgagga tataatcagt 240
ttatgggatc aaagcctaaa gccatgtgta aaattaaccc cactctgtgt tactttaaat 300
tgtactgata agttgacagg tagtactaat ggcacaaata gtactagtgg cactaatagt 360 
actagtggca ctaatagtac tagtactaat agtactgata gttgggaaaa gatgccagaa 420 
ggagaaataa aaaactgctc tttcaatatc accacaagtg taagagataa agtgcagaaa 480 
gaatattctc tcttctataa acttgatgta gtaccaatag ataatgataa tgctagctat 540 
agattgataa attgtaatac ctcagtcatt acacaagcct gtccaaaggt atcttttgaa 600
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taaagataag 660
aagttcaatg gaacaggacc atgtaaaaat gtcagcacag tacaatgcac acatggaatt 720
agaccagtag tatcaactca actgctgtta aatggcagtc tagcagaaga agagatagta 780 
cttagatctg aaaatttcac agacaatgct aaaaccataa tagtacagct gaatgaatct 840 
gtagaaatta attgtataag acccaacaat aatacaagaa aaagtataca tataggacca 900
gggagagcat tttatgcaac aggtgatata ataggagaca taagacaagc acattgtaac 960
attagtaaag caaactggac taacacttta gaacagatag ttgaaaaatt aagagaacaa 1020
tttgggaata ataaaacaat aatctttaat tcatcctcag gaggggaccc agaaattgta 1080
tttcacagtt ttaattgtgg aggggaattt ttctattgta atacatcaca actatttaat 1140
agtacctgga atattactga agaggtaaat aagactaaag aaaatgacac tatcatactc 1200
ccatgcagaa taagacaaat tataaacatg tggcaagaag taggaaaagc aatgtatgcc 1260 
cctcccatca gaggacaaat taaatgttca tcaaatatta cagggctgct attaactaga 1320 
gatggtggta ctaacaataa taggacgaac gacaccgaga ccttcagacc tgggggagga 1380
aacatgaagg acaattggag aagtgaatta tataaatata aagtagtaag aattgaacca 1440 
ttaggagtag cacccaccca ggcaaagaga agagtggtgc aaagagagaa aaga 1494 

<210> 52 
<211> 2007 
<212> DNA 

<213> Human immunodeficiency virus 



<400> 52 

acaacagtct tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60
actctgtttt gtgcatcaga tgctaaagca tacaaagcag aggcacataa cgtctgggct 120
acacatgcct gtgtacccac agaccccaac ccacaggaag taaatttaac aaatgtgaca 180
gaaaatttta acatgtggaa aaataacatg gtggaacaga tgcatgagga tataatcagt 240
ttatgggatc aaagcctaaa gccatgtgta aaattaaccc cactctgtgt tactttaaat 300
tgtactgata agttgacagg tagtactaat ggcacaaata gtactagtgg cactaatagt 360
actagtggca ctaatagtac tagtactaat agtactgata gttgggaaaa gatgccagaa 420
ggagaaataa aaaactgctc tttcaatatc accacaagtg taagagataa agtgcagaaa 480
gaatattctc tcttctataa acttgatgta gtaccaatag ataatgataa tgctagctat 540
agattgataa attgtaatac ctcagtcatt acacaagcct gtccaaaggt atcttttgaa 600
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taaagataag 660
aagttcaatg gaacaggacc atgtaaaaat gtcagcacag tacaatgcac acatggaatt 720
agaccagtag tatcaactca actgctgtta aatggcagtc tagcagaaga agagatagta 780
cttagatctg aaaatttcac agacaatgct aaaaccataa tagtacagct gaatgaatct 840
gtagaaatta attgtataag acccaacaat aatacaagaa aaagtataca tataggacca 900
gggagagcat tttatgcaac aggtgatata ataggagaca taagacaagc acattgtaac 960
attagtaaag caaactggac taacacttta gaacagatag ttgaaaaatt aagagaacaa 1020
tttgggaata ataaaacaat aatctttaat tcatcctcag gaggggaccc agaaattgta 1080
tttcacagtt ttaattgtgg aggggaattt ttctattgta atacatcaca actatttaat 1140
agtacctgga atattactga agaggtaaat aagactaaag aaaatgacac tatcatactc 1200
ccatgcagaa taagacaaat tataaacatg tggcaagaag taggaaaagc aatgtatgcc 1260
cctcccatca gaggacaaat taaatgttca tcaaatatta cagggctgct attaactaga 1320
gatggtggta ctaacaataa taggacgaac gacaccgaga ccttcagacc tgggggagga 1380
aacatgaagg acaattggag aagtgaatta tataaatata aagtagtaag aattgaacca 1440
ttaggagtag cacccaccca ggcaaagaga agagtggtgc aaagagagaa aagagcagtg 1500
ggactaggag ctttgttcat tgggttcttg ggagcagcag gaagcactat gggcgcagcg 1560
tcagtgacgc tgacggtaca ggccagacaa ttattgtctg gtatagtgca acagcagaac 1620
aatttgctga gagctattga ggcgcaacag catctgttgc aactcacggt ctggggcatc 1680 
aaacagctcc aggcaagaat cctggctgtg gaaagatacc taaaggatca acagctccta 1740 
gggatttggg gttgctctgg aaaactcatt tgcaccacta ctgtgccttg gaactctagt 1800 
tggagtaata aatctctgac tgagatttgg gataatatga cctggatgga gtgggaaaga 1860 
gaaattggca attatacagg cttaatatac aatttaattg aaatagcaca aaaccagcaa 1920
gaaaagaatg aacaagaatt attggaatta gacaagtggg caagtttgtg gaattggttt 1980
gatataacaa actggctgtg gtatata 2007

<210> 53 
<211> 2532 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 53 

acaacagtct tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60 
actctgtttt gtgcatcaga tgctaaagca tacaaagcag aggcacataa cgtctgggct 120
acacatgcct gtgtacccac agaccccaac ccacaggaag taaatttaac aaatgtgaca 180
gaaaatttta acatgtggaa aaataacatg gtggaacaga tgcatgagga tataatcagt 240
ttatgggatc aaagcctaaa gccatgtgta aaattaaccc cactctgtgt tactttaaat 300
tgtactgata agttgacagg tagtactaat ggcacaaata gtactagtgg cactaatagt 360
actagtggca ctaatagtac tagtactaat agtactgata gttgggaaaa gatgccagaa 420
ggagaaataa aaaactgctc tttcaatatc accacaagtg taagagataa agtgcagaaa 480 
gaatattctc tcttctataa acttgatgta gtaccaatag ataatgataa tgctagctat 540 
agattgataa attgtaatac ctcagtcatt acacaagcct gtccaaaggt atcttttgaa 600 
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taaagataag 660 
aagttcaatg gaacaggacc atgtaaaaat gtcagcacag tacaatgcac acatggaatt 720
agaccagtag tatcaactca actgctgtta aatggcagtc tagcagaaga agagatagta 780 
cttagatctg aaaatttcac agacaatgct aaaaccataa tagtacagct gaatgaatct 840 
gtagaaatta attgtataag acccaacaat aatacaagaa aaagtataca tataggacca 900 
gggagagcat tttatgcaac aggtgatata ataggagaca taagacaagc acattgtaac 960 
attagtaaag caaactggac taacacttta gaacagatag ttgaaaaatt aagagaacaa 1020
tttgggaata ataaaacaat aatctttaat tcatcctcag gaggggaccc agaaattgta 1080
tttcacagtt ttaattgtgg aggggaattt ttctattgta atacatcaca actatttaat 1140
agtacctgga atattactga agaggtaaat aagactaaag aaaatgacac tatcatactc 1200
ccatgcagaa taagacaaat tataaacatg tggcaagaag taggaaaagc aatgtatgcc 1260
cctcccatca gaggacaaat taaatgttca tcaaatatta cagggctgct attaactaga 1320
gatggtggta ctaacaataa taggacgaac gacaccgaga ccttcagacc tgggggagga 1380
aacatgaagg acaattggag aagtgaatta tataaatata aagtagtaag aattgaacca 1440 
ttaggagtag cacccaccca ggcaaagaga agagtggtgc aaagagagaa aagagcagtg 1500 
ggactaggag ctttgttcat tgggttcttg ggagcagcag gaagcactat gggcgcagcg 1560 
tcagtgacgc tgacggtaca ggccagacaa ttattgtctg gtatagtgca acagcagaac 1620
aatttgctga gagctattga ggcgcaacag catctgttgc aactcacggt ctggggcatc 1680 
aaacagctcc aggcaagaat cctggctgtg gaaagatacc taaaggatca acagctccta 1740 
gggatttggg gttgctctgg aaaactcatt tgcaccacta ctgtgccttg gaactctagt 1800 
tggagtaata aatctctgac tgagatttgg gataatatga cctggatgga gtgggaaaga 1860 
gaaattggca attatacagg cttaatatac aatttaattg aaatagcaca aaaccagcaa 1920
gaaaagaatg aacaagaatt attggaatta gacaagtggg caagtttgtg gaattggttt 1980
gatataacaa actggctgtg gtatataaga atattcataa tgatagtagg aggcttgata 2040
ggtttaagaa tagtttttgc tgtactttct atagtgaata gagttaggca gggatactca 2100
ccaatatcat tgcagacccg cctcccagct cagaggggac ccgacaggcc cgaaggaatc 2160 
gaagaagaag gtggagagag agacagagac agatccaatc gattagtgca tggattattg 2220
gcactcatct gggacgatct gcggagcctg tgcctcttca gctaccaccg cttgagagac 2280
ttactcttga ttgtagcgag gattgtggaa cttctgggac gcagggggtg ggaagccctc 2340
aagtattggt ggaatctcct gcagtattgg agtcaggagc taaagagtag tgctgttagt 2400
ttgtttaatg ccacagcaat agcagtagct gaagggacag ataggattat agaaatagta 2460 
caaagaattt ttagagctgt aattcacata cctagaagaa taagacaggg cttggagagg 2520 
gctttactat aa 2532 

<210> 54 
<211> 1599 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: gp120.modUS4
<400> 54 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540 
agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600 
atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720
gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840
agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900 
atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960 
cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020
gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080 
atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140 
agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200 
tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260 
aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320 
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380
attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440 
gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500 
tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560 
gtgcagcgcg agaagcgcta agatatcgga tcctctaga 1599

<210> 55 
<211> 1350 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
gp120.modUS4.del 128-194

<400> 55 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggggc agggaactgc gagaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 
aagtgcaagg acaagaagtt caacggcacc ggcccctgca agaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggaggaga tcgtgctgcg ctccgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaacg agtccgtgga gatcaactgc atccgcccca acaacaacac gcgtaagagc 720
atccacatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780
caggcccact gcaacatcag caaggccaac tggaccaaca ccctcgagca gatcgtggag 840 
aagctgcgcg agcagttcgg caacaacaag accatcatct tcaacagcag cagcggcggc 900 
gaccccgaga tcgtgttcca cagcttcaac tgcggcggcg agttcttcta ctgcaacacc 960 
agccagctgt tcaacagcac ctggaacatc accgaggagg tgaacaagac caaggagaac 1020 
gacaccatca tcctgccctg ccgcatccgc cagatcatca acatgtggca ggaggtgggc 1080 
aaggccatgt acgccccccc catccgcggc cagatcaagt gcagcagcaa tattaccggc 1140 
ctgctgctga cccgcgacgg cggcaccaac aacaaccgca ccaacgacac cgagaccttc 1200 
cgccccggcg gcggcaacat gaaggacaac tggcgcagcg agctgtacaa gtacaaggtg 1260 
gtgcgcatcg agcccctggg cgtggccccc acccaggcca agcgccgcgt ggtgcagcgc 1320 
gagaagcgct aagatatcgg atcctctaga 1350 

<210> 56 
<211> 2112 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: gp140.modUS4
<400> 56 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420 
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540 
agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600 
atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720 
gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840
agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900 
atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960 
cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020 
gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080 
atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140 
agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200 
tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260 
aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380
attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440
gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500 
tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560 
gtgcagcgcg agaagcgcgc cgtgggcctg ggcgccctgt tcatcggctt cctgggcgcc 1620
gccgggagca ccatgggcgc cgcctccgtg accctgaccg tgcaggcccg ccagctgctg 1680 
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1740
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcatcctggc cgtggagcgc 1800 
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1860 
accaccgtgc cctggaacag cagctggagc aacaagagcc tgaccgagat ctgggacaac 1920
atgacctgga tggagtggga gcgcgagatc ggcaactaca ccggcctgat ctacaacctg 1980 
atcgagatcg cccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 2040
tgggccagcc tgtggaactg gttcgacatc accaactggc tgtggtacat ctaagatatc 2100 
ggatcctcta ga 2112 

<210> 57 
<211> 2112 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
gp140.mut.modUS4

<400> 57 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420 
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480 
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540 
agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600 
atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720
gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840 
agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900 
atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960 
cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020 
gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080
atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140
agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200
tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260
aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380
attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440
gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500
tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560 
gtgcagcgcg agaagagcgc cgtgggcctg ggcgccctgt tcatcggctt cctgggcgcc 1620
gccgggagca ccatgggcgc cgcctccgtg accctgaccg tgcaggcccg ccagctgctg 1680
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1740
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcatcctggc cgtggagcgc 1800 
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1860
accaccgtgc cctggaacag cagctggagc aacaagagcc tgaccgagat ctgggacaac 1920
atgacctgga tggagtggga gcgcgagatc ggcaactaca ccggcctgat ctacaacctg 1980
atcgagatcg cccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 2040
tgggccagcc tgtggaactg gttcgacatc accaactggc tgtggtacat ctaagatatc 2100
ggatcctcta ga 2112

<210> 58
<211> 2181
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: gp140TM.modUS4
<400> 58

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420 
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480 
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540
agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600 
atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720 
gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840
agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900 
atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960 
cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020 
gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080 
atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140 
agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200
tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260
aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380
attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440
gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500
tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560 
gtgcagcgcg agaagcgcgc cgtgggcctg ggcgccctgt tcatcggctt cctgggcgcc 1620
gccgggagca ccatgggcgc cgcctccgtg accctgaccg tgcaggcccg ccagctgctg 1680
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1740
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcatcctggc cgtggagcgc 1800
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1860 
accaccgtgc cctggaacag cagctggagc aacaagagcc tgaccgagat ctgggacaac 1920
atgacctgga tggagtggga gcgcgagatc ggcaactaca ccggcctgat ctacaacctg 1980
atcgagatcg cccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 2040
tgggccagcc tgtggaactg gttcgacatc accaactggc tgtggtacat ccgcatcttc 2100 
atcatgatcg tgggcggcct gatcggcctg cgcatcgtgt tcgccgtgct gagcatcgtg 2160 
taagatatcg gatcctctag a 2181 

<210> 59
<211> 1818 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
gp140.modUS4.delV1/V2

<400> 59 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360
ggccaggcct gccccaaggt gagcttcgag cccatcccca tccactactg cgcccccgcc 420
ggcttcgcca tcctgaagtg caaggacaag aagttcaacg gcaccggccc ctgcaagaac 480 
gtgagcaccg tgcagtgcac ccacggcatc cgccccgtgg tgagcaccca gctgctgctg 540 
aacggcagcc tggccgagga ggagatcgtg ctgcgctccg agaacttcac cgacaacgcc 600
aagaccatca tcgtgcagct gaacgagtcc gtggagatca actgcatccg ccccaacaac 660
aacacgcgta agagcatcca catcggcccc ggccgcgcct tctacgccac cggcgacatc 720
atcggcgaca tccgccaggc ccactgcaac atcagcaagg ccaactggac caacaccctc 780
gagcagatcg tggagaagct gcgcgagcag ttcggcaaca acaagaccat catcttcaac 840
agcagcagcg gcggcgaccc cgagatcgtg ttccacagct tcaactgcgg cggcgagttc 900
ttctactgca acaccagcca gctgttcaac agcacctgga acatcaccga ggaggtgaac 960
aagaccaagg agaacgacac catcatcctg ccctgccgca tccgccagat catcaacatg 1020
tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat caagtgcagc 1080
agcaatatta ccggcctgct gctgacccgc gacggcggca ccaacaacaa ccgcaccaac 1140
gacaccgaga ccttccgccc cggcggcggc aacatgaagg acaactggcg cagcgagctg 1200
tacaagtaca aggtggtgcg catcgagccc ctgggcgtgg cccccaccca ggccaagcgc 1260
cgcgtggtgc agcgcgagaa gcgcgccgtg ggcctgggcg ccctgttcat cggcttcctg 1320
ggcgccgccg ggagcaccat gggcgccgcc tccgtgaccc tgaccgtgca ggcccgccag 1380
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcat cctggccgtg 1500
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560
tgcaccacca ccgtgccctg gaacagcagc tggagcaaca agagcctgac cgagatctgg 1620
gacaacatga cctggatgga gtgggagcgc gagatcggca actacaccgg cctgatctac 1680
aacctgatcg agatcgccca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740
gacaagtggg ccagcctgtg gaactggttc gacatcacca actggctgtg gtacatctaa 1800
gatatcggat cctctaga 1818

<210> 60 
<211> 2031 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
gp140.modUS4.delV2

<400> 60 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480 
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcggcgcc 540
ggccgcctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 600 
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaaggac 660 
aagaagttca acggcaccgg cccctgcaag aacgtgagca ccgtgcagtg cacccacggc 720
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggaggagatc 780
gtgctgcgct ccgagaactt caccgacaac gccaagacca tcatcgtgca gctgaacgag 840
tccgtggaga tcaactgcat ccgccccaac aacaacacgc gtaagagcat ccacatcggc 900 
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 960 
aacatcagca aggccaactg gaccaacacc ctcgagcaga tcgtggagaa gctgcgcgag 1020
cagttcggca acaacaagac catcatcttc aacagcagca gcggcggcga ccccgagatc 1080
gtgttccaca gcttcaactg cggcggcgag ttcttctact gcaacaccag ccagctgttc 1140 
aacagcacct ggaacatcac cgaggaggtg aacaagacca aggagaacga caccatcatc 1200
ctgccctgcc gcatccgcca gatcatcaac atgtggcagg aggtgggcaa ggccatgtac 1260 
gcccccccca tccgcggcca gatcaagtgc agcagcaata ttaccggcct gctgctgacc 1320
cgcgacggcg gcaccaacaa caaccgcacc aacgacaccg agaccttccg ccccggcggc 1380
ggcaacatga aggacaactg gcgcagcgag ctgtacaagt acaaggtggt gcgcatcgag 1440
cccctgggcg tggcccccac ccaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 
gtgggcctgg gcgccctgtt catcggcttc ctgggcgccg ccgggagcac catgggcgcc 1560 
gcctccgtga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 
atcaagcagc tgcaggcccg catcctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccaccgtgcc ctggaacagc 1800 
agctggagca acaagagcct gaccgagatc tgggacaaca tgacctggat ggagtgggag 1860 
cgcgagatcg gcaactacac cggcctgatc tacaacctga tcgagatcgc ccagaaccag 1920
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980
ttcgacatca ccaactggct gtggtacatc taagatatcg gatcctctag a 2031

<210> 61 
<211> 1818 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
gp140.mut.modUS4.delV1/V2

<400> 61 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360
ggccaggcct gccccaaggt gagcttcgag cccatcccca tccactactg cgcccccgcc 420
ggcttcgcca tcctgaagtg caaggacaag aagttcaacg gcaccggccc ctgcaagaac 480
gtgagcaccg tgcagtgcac ccacggcatc cgccccgtgg tgagcaccca gctgctgctg 540 
aacggcagcc tggccgagga ggagatcgtg ctgcgctccg agaacttcac cgacaacgcc 600 
aagaccatca tcgtgcagct gaacgagtcc gtggagatca actgcatccg ccccaacaac 660 
aacacgcgta agagcatcca catcggcccc ggccgcgcct tctacgccac cggcgacatc 720
atcggcgaca tccgccaggc ccactgcaac atcagcaagg ccaactggac caacaccctc 780
gagcagatcg tggagaagct gcgcgagcag ttcggcaaca acaagaccat catcttcaac 840
agcagcagcg gcggcgaccc cgagatcgtg ttccacagct tcaactgcgg cggcgagttc 900
ttctactgca acaccagcca gctgttcaac agcacctgga acatcaccga ggaggtgaac 960
aagaccaagg agaacgacac catcatcctg ccctgccgca tccgccagat catcaacatg 1020
tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat caagtgcagc 1080
agcaatatta ccggcctgct gctgacccgc gacggcggca ccaacaacaa ccgcaccaac 1140
gacaccgaga ccttccgccc cggcggcggc aacatgaagg acaactggcg cagcgagctg 1200
tacaagtaca aggtggtgcg catcgagccc ctgggcgtgg cccccaccca ggccaagcgc 1260
cgcgtggtgc agcgcgagaa gagcgccgtg ggcctgggcg ccctgttcat cggcttcctg 1320
ggcgccgccg ggagcaccat gggcgccgcc tccgtgaccc tgaccgtgca ggcccgccag 1380
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcat cctggccgtg 1500
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560
tgcaccacca ccgtgccctg gaacagcagc tggagcaaca agagcctgac cgagatctgg 1620
gacaacatga cctggatgga gtgggagcgc gagatcggca actacaccgg cctgatctac 1680
aacctgatcg agatcgccca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740
gacaagtggg ccagcctgtg gaactggttc gacatcacca actggctgtg gtacatctaa 1800
gatatcggat cctctaga 1818



<210> 62
<211> 1818
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
gp140.modUS4.del 128-194

<400> 62

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360
ggccaggcct gccccaaggt gagcttcgag cccatcccca tccactactg cgcccccgcc 420
ggcttcgcca tcctgaagtg caaggacaag aagttcaacg gcaccggccc ctgcaagaac 480
gtgagcaccg tgcagtgcac ccacggcatc cgccccgtgg tgagcaccca gctgctgctg 540
aacggcagcc tggccgagga ggagatcgtg ctgcgctccg agaacttcac cgacaacgcc 600
aagaccatca tcgtgcagct gaacgagtcc gtggagatca actgcatccg ccccaacaac 660
aacacgcgta agagcatcca catcggcccc ggccgcgcct tctacgccac cggcgacatc 720
atcggcgaca tccgccaggc ccactgcaac atcagcaagg ccaactggac caacaccctc 780
gagcagatcg tggagaagct gcgcgagcag ttcggcaaca acaagaccat catcttcaac 840
agcagcagcg gcggcgaccc cgagatcgtg ttccacagct tcaactgcgg cggcgagttc 900
ttctactgca acaccagcca gctgttcaac agcacctgga acatcaccga ggaggtgaac 960
aagaccaagg agaacgacac catcatcctg ccctgccgca tccgccagat catcaacatg 1020
tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat caagtgcagc 1080
agcaatatta ccggcctgct gctgacccgc gacggcggca ccaacaacaa ccgcaccaac 1140
gacaccgaga ccttccgccc cggcggcggc aacatgaagg acaactggcg cagcgagctg 1200
tacaagtaca aggtggtgcg catcgagccc ctgggcgtgg cccccaccca ggccaagcgc 1260
cgcgtggtgc agcgcgagaa gagcgccgtg ggcctgggcg ccctgttcat cggcttcctg 1320
ggcgccgccg ggagcaccat gggcgccgcc tccgtgaccc tgaccgtgca ggcccgccag 1380
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcat cctggccgtg 1500
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560
tgcaccacca ccgtgccctg gaacagcagc tggagcaaca agagcctgac cgagatctgg 1620
gacaacatga cctggatgga gtgggagcgc gagatcggca actacaccgg cctgatctac 1680
aacctgatcg agatcgccca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740
gacaagtggg ccagcctgtg gaactggttc gacatcacca actggctgtg gtacatctaa 1800
gatatcggat cctctaga 1818

<210> 63
<211> 1863
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
gp140.mut.modUS4.del 128-194

<400> 63

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgggggc agggaactgc gagaccagcg tgatcaccca ggcctgcccc 420
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaagg acaagaagtt caacggcacc ggcccctgca agaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600
gaggaggaga tcgtgctgcg ctccgagaac ttcaccgaca acgccaagac catcatcgtg 660
cagctgaacg agtccgtgga gatcaactgc atccgcccca acaacaacac gcgtaagagc 720
atccacatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780
caggcccact gcaacatcag caaggccaac tggaccaaca ccctcgagca gatcgtggag 840
aagctgcgcg agcagttcgg caacaacaag accatcatct tcaacagcag cagcggcggc 900
gaccccgaga tcgtgttcca cagcttcaac tgcggcggcg agttcttcta ctgcaacacc 960
agccagctgt tcaacagcac ctggaacatc accgaggagg tgaacaagac caaggagaac 1020
gacaccatca tcctgccctg ccgcatccgc cagatcatca acatgtggca ggaggtgggc 1080
aaggccatgt acgccccccc catccgcggc cagatcaagt gcagcagcaa tattaccggc 1140
ctgctgctga cccgcgacgg cggcaccaac aacaaccgca ccaacgacac cgagaccttc 1200
cgccccggcg gcggcaacat gaaggacaac tggcgcagcg agctgtacaa gtacaaggtg 1260
gtgcgcatcg agcccctggg cgtggccccc acccaggcca agcgccgcgt ggtgcagcgc 1320
gagaagagcg ccgtgggcct gggcgccctg ttcatcggct tcctgggcgc cgccgggagc 1380
accatgggcg ccgcctccgt gaccctgacc gtgcaggccc gccagctgct gagcggcatc 1440
gtgcagcagc agaacaacct gctgcgcgcc atcgaggccc agcagcacct gctgcagctg 1500
accgtgtggg gcatcaagca gctgcaggcc cgcatcctgg ccgtggagcg ctacctgaag 1560
gaccagcagc tgctgggcat ctggggctgc agcggcaagc tgatctgcac caccaccgtg 1620
ccctggaaca gcagctggag caacaagagc ctgaccgaga tctgggacaa catgacctgg 1680
atggagtggg agcgcgagat cggcaactac accggcctga tctacaacct gatcgagatc 1740
gcccagaacc agcaggagaa gaacgagcag gagctgctgg agctggacaa gtgggccagc 1800
ctgtggaact ggttcgacat caccaactgg ctgtggtaca tctaagatat cggatcctct 1860
aga 1863

<210> 64
<211> 2634
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: gp160.modUS4
<400> 64

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420 
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540
agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600 
atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720
gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840
agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900 
atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960 
cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020
gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080 
atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140 
agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200 
tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260 
aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380
attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440 
gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500
tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560 
gtgcagcgcg agaagcgcgc cgtgggcctg ggcgccctgt tcatcggctt cctgggcgcc 1620
gccgggagca ccatgggcgc cgcctccgtg accctgaccg tgcaggcccg ccagctgctg 1680 
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1740
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcatcctggc cgtggagcgc 1800
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1860 
accaccgtgc cctggaacag cagctggagc aacaagagcc tgaccgagat ctgggacaac 1920 
atgacctgga tggagtggga gcgcgagatc ggcaactaca ccggcctgat ctacaacctg 1980 
atcgagatcg cccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 2040 
tgggccagcc tgtggaactg gttcgacatc accaactggc tgtggtacat ccgcatcttc 2100 
atcatgatcg tgggcggcct gatcggcctg cgcatcgtgt tcgccgtgct gagcatcgtg 2160 
aaccgcgtgc gccagggcta cagccccatc agcctgcaga cccgcctgcc cgcccagcgc 2220
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 2280 
aaccgcctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2340
ttcagctacc accgcctgcg cgacctgctg ctgatcgtgg cccgcatcgt ggagctgctg 2400 
ggccgccgcg gctgggaggc cctgaagtac tggtggaacc tgctgcagta ctggagccag 2460 
gagctgaaga gcagcgccgt gagcctgttc aacgccaccg ccatcgccgt ggccgagggc 2520
accgaccgca tcatcgagat cgtgcagcgc atcttccgcg ccgtgatcca catcccccgc 2580 
cgcatccgcc agggcctgga gcgcgccctg ctgtaagata tcggatcctc taga 2634 



<210> 65
<211> 2538
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
gp160.modUS4.delV1

<400> 65 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gaactgcacc gacaagctgg gcgccggcgg cgagatcaag 420 
aactgcagct tcaacatcac caccagcgtg cgcgacaagg tgcagaagga gtacagcctg 480 
ttctacaagc tggacgtggt gcccatcgac aacgacaacg ccagctaccg cctgatcaac 540 
tgcaacacca gcgtgatcac ccaggcctgc cccaaggtga gcttcgagcc catccccatc 600
cactactgcg cccccgccgg cttcgccatc ctgaagtgca aggacaagaa gttcaacggc 660 
accggcccct gcaagaacgt gagcaccgtg cagtgcaccc acggcatccg ccccgtggtg 720 
agcacccagc tgctgctgaa cggcagcctg gccgaggagg agatcgtgct gcgctccgag 780 
aacttcaccg acaacgccaa gaccatcatc gtgcagctga acgagtccgt ggagatcaac 840
tgcatccgcc ccaacaacaa cacgcgtaag agcatccaca tcggccccgg ccgcgccttc 900 
tacgccaccg gcgacatcat cggcgacatc cgccaggccc actgcaacat cagcaaggcc 960 
aactggacca acaccctcga gcagatcgtg gagaagctgc gcgagcagtt cggcaacaac 1020
aagaccatca tcttcaacag cagcagcggc ggcgaccccg agatcgtgtt ccacagcttc 1080
aactgcggcg gcgagttctt ctactgcaac accagccagc tgttcaacag cacctggaac 1140 
atcaccgagg aggtgaacaa gaccaaggag aacgacacca tcatcctgcc ctgccgcatc 1200 
cgccagatca tcaacatgtg gcaggaggtg ggcaaggcca tgtacgcccc ccccatccgc 1260 
ggccagatca agtgcagcag caatattacc ggcctgctgc tgacccgcga cggcggcacc 1320
aacaacaacc gcaccaacga caccgagacc ttccgccccg gcggcggcaa catgaaggac 1380
aactggcgca gcgagctgta caagtacaag gtggtgcgca tcgagcccct gggcgtggcc 1440 
cccacccagg ccaagcgccg cgtggtgcag cgcgagaagc gcgccgtggg cctgggcgcc 1500 
ctgttcatcg gcttcctggg cgccgccggg agcaccatgg gcgccgcctc cgtgaccctg 1560 
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1620 
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1680 
gcccgcatcc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1740 
tgcagcggca agctgatctg caccaccacc gtgccctgga acagcagctg gagcaacaag 1800 
agcctgaccg agatctggga caacatgacc tggatggagt gggagcgcga gatcggcaac 1860 
tacaccggcc tgatctacaa cctgatcgag atcgcccaga accagcagga gaagaacgag 1920
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcaccaac 1980 
tggctgtggt acatccgcat cttcatcatg atcgtgggcg gcctgatcgg cctgcgcatc 2040
gtgttcgccg tgctgagcat cgtgaaccgc gtgcgccagg gctacagccc catcagcctg 2100 
cagacccgcc tgcccgccca gcgcggcccc gaccgccccg agggcatcga ggaggagggc 2160 
ggcgagcgcg accgcgaccg cagcaaccgc ctggtgcacg gcctgctggc cctgatctgg 2220
gacgacctgc gcagcctgtg cctgttcagc taccaccgcc tgcgcgacct gctgctgatc 2280
gtggcccgca tcgtggagct gctgggccgc cgcggctggg aggccctgaa gtactggtgg 2340
aacctgctgc agtactggag ccaggagctg aagagcagcg ccgtgagcct gttcaacgcc 2400
accgccatcg ccgtggccga gggcaccgac cgcatcatcg agatcgtgca gcgcatcttc 2460 
cgcgccgtga tccacatccc ccgccgcatc cgccagggcc tggagcgcgc cctgctgtaa 2520
gatatcggat cctctaga 2538

<210> 66
<211> 2553
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
gp160.modUS4.delV2

<400> 66 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480 
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcggcgcc 540 
ggccgcctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 600 
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaaggac 660 
aagaagttca acggcaccgg cccctgcaag aacgtgagca ccgtgcagtg cacccacggc 720 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggaggagatc 780 
gtgctgcgct ccgagaactt caccgacaac gccaagacca tcatcgtgca gctgaacgag 840
tccgtggaga tcaactgcat ccgccccaac aacaacacgc gtaagagcat ccacatcggc 900 
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 960 
aacatcagca aggccaactg gaccaacacc ctcgagcaga tcgtggagaa gctgcgcgag 1020
cagttcggca acaacaagac catcatcttc aacagcagca gcggcggcga ccccgagatc 1080
gtgttccaca gcttcaactg cggcggcgag ttcttctact gcaacaccag ccagctgttc 1140 
aacagcacct ggaacatcac cgaggaggtg aacaagacca aggagaacga caccatcatc 1200
ctgccctgcc gcatccgcca gatcatcaac atgtggcagg aggtgggcaa ggccatgtac 1260 
gcccccccca tccgcggcca gatcaagtgc agcagcaata ttaccggcct gctgctgacc 1320
cgcgacggcg gcaccaacaa caaccgcacc aacgacaccg agaccttccg ccccggcggc 1380
ggcaacatga aggacaactg gcgcagcgag ctgtacaagt acaaggtggt gcgcatcgag 1440 
cccctgggcg tggcccccac ccaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500
gtgggcctgg gcgccctgtt catcggcttc ctgggcgccg ccgggagcac catgggcgcc 1560 
gcctccgtga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 
atcaagcagc tgcaggcccg catcctggcc gtggagcgct acctgaagga ccagcagctg 1740
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccaccgtgcc ctggaacagc 1800 
agctggagca acaagagcct gaccgagatc tgggacaaca tgacctggat ggagtgggag 1860 
cgcgagatcg gcaactacac cggcctgatc tacaacctga tcgagatcgc ccagaaccag 1920 
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 
ttcgacatca ccaactggct gtggtacatc cgcatcttca tcatgatcgt gggcggcctg 2040 
atcggcctgc gcatcgtgtt cgccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 
agccccatca gcctgcagac ccgcctgccc gcccagcgcg gccccgaccg ccccgagggc 2160 
atcgaggagg agggcggcga gcgcgaccgc gaccgcagca accgcctggt gcacggcctg 2220 
ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280 
gacctgctgc tgatcgtggc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340 
ctgaagtact ggtggaacct gctgcagtac tggagccagg agctgaagag cagcgccgtg 2400
agcctgttca acgccaccgc catcgccgtg gccgagggca ccgaccgcat catcgagatc 2460 
gtgcagcgca tcttccgcgc cgtgatccac atcccccgcc gcatccgcca gggcctggag 2520
cgcgccctgc tgtaagatat cggatcctct aga 2553 

<210> 67 
<211> 2340 



<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp160.modUS4.delV1/V2

<400> 67 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360 
ggccaggcct gccccaaggt gagcttcgag cccatcccca tccactactg cgcccccgcc 420 
ggcttcgcca tcctgaagtg caaggacaag aagttcaacg gcaccggccc ctgcaagaac 480 
gtgagcaccg tgcagtgcac ccacggcatc cgccccgtgg tgagcaccca gctgctgctg 540
aacggcagcc tggccgagga ggagatcgtg ctgcgctccg agaacttcac cgacaacgcc 600 
aagaccatca tcgtgcagct gaacgagtcc gtggagatca actgcatccg ccccaacaac 660 
aacacgcgta agagcatcca catcggcccc ggccgcgcct tctacgccac cggcgacatc 720 
atcggcgaca tccgccaggc ccactgcaac atcagcaagg ccaactggac caacaccctc 780 
gagcagatcg tggagaagct gcgcgagcag ttcggcaaca acaagaccat catcttcaac 840 
agcagcagcg gcggcgaccc cgagatcgtg ttccacagct tcaactgcgg cggcgagttc 900 
ttctactgca acaccagcca gctgttcaac agcacctgga acatcaccga ggaggtgaac 960 
aagaccaagg agaacgacac catcatcctg ccctgccgca tccgccagat catcaacatg 1020 
tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat caagtgcagc 1080 
agcaatatta ccggcctgct gctgacccgc gacggcggca ccaacaacaa ccgcaccaac 1140 
gacaccgaga ccttccgccc cggcggcggc aacatgaagg acaactggcg cagcgagctg 1200
tacaagtaca aggtggtgcg catcgagccc ctgggcgtgg cccccaccca ggccaagcgc 1260 
cgcgtggtgc agcgcgagaa gcgcgccgtg ggcctgggcg ccctgttcat cggcttcctg 1320
ggcgccgccg ggagcaccat gggcgccgcc tccgtgaccc tgaccgtgca ggcccgccag 1380 
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcat cctggccgtg 1500 
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560 
tgcaccacca ccgtgccctg gaacagcagc tggagcaaca agagcctgac cgagatctgg 1620
gacaacatga cctggatgga gtgggagcgc gagatcggca actacaccgg cctgatctac 1680
aacctgatcg agatcgccca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740 
gacaagtggg ccagcctgtg gaactggttc gacatcacca actggctgtg gtacatccgc 1800 
atcttcatca tgatcgtggg cggcctgatc ggcctgcgca tcgtgttcgc cgtgctgagc 1860 
atcgtgaacc gcgtgcgcca gggctacagc cccatcagcc tgcagacccg cctgcccgcc 1920
cagcgcggcc ccgaccgccc cgagggcatc gaggaggagg gcggcgagcg cgaccgcgac 1980 
cgcagcaacc gcctggtgca cggcctgctg gccctgatct gggacgacct gcgcagcctg 2040
tgcctgttca gctaccaccg cctgcgcgac ctgctgctga tcgtggcccg catcgtggag 2100 
ctgctgggcc gccgcggctg ggaggccctg aagtactggt ggaacctgct gcagtactgg 2160
agccaggagc tgaagagcag cgccgtgagc ctgttcaacg ccaccgccat cgccgtggcc 2220
gagggcaccg accgcatcat cgagatcgtg cagcgcatct tccgcgccgt gatccacatc 2280
ccccgccgca tccgccaggg cctggagcgc gccctgctgt aagatatcgg atcctctaga 2340 

<210> 68 
<211> 2385 
<212> DNA 

<213> Artificial Sequence 
<220> 



<223> Description of Artificial Sequence: 
gp160.modUS4del 128-194

<400> 68 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggggc agggaactgc gagaccagcg tgatcaccca ggcctgcccc 420
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 
aagtgcaagg acaagaagtt caacggcacc ggcccctgca agaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggaggaga tcgtgctgcg ctccgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaacg agtccgtgga gatcaactgc atccgcccca acaacaacac gcgtaagagc 720
atccacatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag caaggccaac tggaccaaca ccctcgagca gatcgtggag 840
aagctgcgcg agcagttcgg caacaacaag accatcatct tcaacagcag cagcggcggc 900 
gaccccgaga tcgtgttcca cagcttcaac tgcggcggcg agttcttcta ctgcaacacc 960 
agccagctgt tcaacagcac ctggaacatc accgaggagg tgaacaagac caaggagaac 1020
gacaccatca tcctgccctg ccgcatccgc cagatcatca acatgtggca ggaggtgggc 1080 
aaggccatgt acgccccccc catccgcggc cagatcaagt gcagcagcaa tattaccggc 1140
ctgctgctga cccgcgacgg cggcaccaac aacaaccgca ccaacgacac cgagaccttc 1200
cgccccggcg gcggcaacat gaaggacaac tggcgcagcg agctgtacaa gtacaaggtg 1260 
gtgcgcatcg agcccctggg cgtggccccc acccaggcca agcgccgcgt ggtgcagcgc 1320
gagaagcgcg ccgtgggcct gggcgccctg ttcatcggct tcctgggcgc cgccgggagc 1380
accatgggcg ccgcctccgt gaccctgacc gtgcaggccc gccagctgct gagcggcatc 1440
gtgcagcagc agaacaacct gctgcgcgcc atcgaggccc agcagcacct gctgcagctg 1500 
accgtgtggg gcatcaagca gctgcaggcc cgcatcctgg ccgtggagcg ctacctgaag 1560 
gaccagcagc tgctgggcat ctggggctgc agcggcaagc tgatctgcac caccaccgtg 1620
ccctggaaca gcagctggag caacaagagc ctgaccgaga tctgggacaa catgacctgg 1680 
atggagtggg agcgcgagat cggcaactac accggcctga tctacaacct gatcgagatc 1740
gcccagaacc agcaggagaa gaacgagcag gagctgctgg agctggacaa gtgggccagc 1800 
ctgtggaact ggttcgacat caccaactgg ctgtggtaca tccgcatctt catcatgatc 1860 
gtgggcggcc tgatcggcct gcgcatcgtg ttcgccgtgc tgagcatcgt gaaccgcgtg 1920
cgccagggct acagccccat cagcctgcag acccgcctgc ccgcccagcg cggccccgac 1980 
cgccccgagg gcatcgagga ggagggcggc gagcgcgacc gcgaccgcag caaccgcctg 2040
gtgcacggcc tgctggccct gatctgggac gacctgcgca gcctgtgcct gttcagctac 2100 
caccgcctgc gcgacctgct gctgatcgtg gcccgcatcg tggagctgct gggccgccgc 2160 
ggctgggagg ccctgaagta ctggtggaac ctgctgcagt actggagcca ggagctgaag 2220
agcagcgccg tgagcctgtt caacgccacc gccatcgccg tggccgaggg caccgaccgc 2280
atcatcgaga tcgtgcagcg catcttccgc gccgtgatcc acatcccccg ccgcatccgc 2340
cagggcctgg agcgcgccct gctgtaagat atcggatcct ctaga 2385 

<210> 69 
<211> 144 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 69 

gacaccatca tcctgccctg ccgcatccgc cagatcatca acatgtggca ggaggtgggc 60 
aaggccatgt acgccccccc catccgcggc cagatcaagt gcagcagcaa catcaccggc 120 
ctgctgctga cccgcgacgg cggc 144

<210> 70
<211> 144 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 70 

ggaactatca cactcccatg cagaataaaa caaattataa acaggtggca ggaagtagga 60 
aaagcaatgt atgcccctcc catcagagga caaattagat gctcatcaaa tattacagga 120
ctgctattaa caagagatgg tggt 144 

<210> 71
<211> 144 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic Env 
US4 common region

<400> 71 

gacaccatca tcctgccctg ccgcatccgc cagatcatca acatgtggca ggaggtgggc 60
aaggccatgt acgccccccc catccgcggc cagatcaagt gcagcagcaa catcaccggc 120
ctgctgctga cccgcgacgg cggc 144

<210> 72 
<211> 144 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic Env 
SF162 common region 

<400> 72 

ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 60
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 120
ctgctgctga cccgcgacgg cggc 144 

<210> 73 
<211> 4766 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp160.modUS4.gag.modSF2

<400> 73 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540 
agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600 
atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720
gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840 
agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900
atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960 
cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020
gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080 
atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140 
agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200
tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260
aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380
attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440
gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500
tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560 
gtgcagcgcg agaagcgcgc cgtgggcctg ggcgccctgt tcatcggctt cctgggcgcc 1620
gccgggagca ccatgggcgc cgcctccgtg accctgaccg tgcaggcccg ccagctgctg 1680
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1740
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcatcctggc cgtggagcgc 1800
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1860 
accaccgtgc cctggaacag cagctggagc aacaagagcc tgaccgagat ctgggacaac 1920
atgacctgga tggagtggga gcgcgagatc ggcaactaca ccggcctgat ctacaacctg 1980
atcgagatcg cccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 2040
tgggccagcc tgtggaactg gttcgacatc accaactggc tgtggtacat ccgcatcttc 2100 
atcatgatcg tgggcggcct gatcggcctg cgcatcgtgt tcgccgtgct gagcatcgtg 2160 
aaccgcgtgc gccagggcta cagccccatc agcctgcaga cccgcctgcc cgcccagcgc 2220 
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 2280 
aaccgcctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2340 
ttcagctacc accgcctgcg cgacctgctg ctgatcgtgg cccgcatcgt ggagctgctg 2400 
ggccgccgcg gctgggaggc cctgaagtac tggtggaacc tgctgcagta ctggagccag 2460 
gagctgaaga gcagcgccgt gagcctgttc aacgccaccg ccatcgccgt ggccgagggc 2520
accgaccgca tcatcgagat cgtgcagcgc atcttccgcg ccgtgatcca catcccccgc 2580 
cgcatccgcc agggcctgga gcgcgccctg ctgtaagata tcggatcctc tagagaattc 2640 
cgcccccccc cccccccccc ctctccctcc ccccccccta acgttactgg ccgaagccgc 2700 
ttggaataag gccggtgtgc gtttgtctat atgttatttt ccaccatatt gccgtctttt 2760 
ggcaatgtga gggcccggaa acctggccct gtcttcttga cgagcattcc taggggtctt 2820
tcccctctcg ccaaaggaat gcaaggtctg ttgaatgtcg tgaaggaagc agttcctctg 2880 
gaagcttctt gaagacaaac aacgtctgta gcgacccttt gcaggcagcg gaacccccca 2940
cctggcgaca ggtgcctctg cggccaaaag ccacgtgtat aagatacacc tgcaaaggcg 3000
gcacaacccc agtgccacgt tgtgagttgg atagttgtgg aaagagtcaa atggctctcc 3060 
tcaagcgtat tcaacaaggg gctgaaggat gcccagaagg taccccattg tatgggatct 3120
gatctggggc ctcggtgcac atgctttaca tgtgtttagt cgaggttaaa aaaacgtcta 3180 
ggccccccga accacgggga cgtggttttc ctttgaaaaa cacgataata ccatgggcgc 3240 
ccgcgccagc gtgctgagcg gcggcgagct ggacaagtgg gagaagatcc gcctgcgccc 3300
cggcggcaag aagaagtaca agctgaagca catcgtgtgg gccagccgcg agctggagcg 3360 
cttcgccgtg aaccccggcc tgctggagac cagcgagggc tgccgccaga tcctgggcca 3420
gctgcagccc agcctgcaga ccggcagcga ggagctgcgc agcctgtaca acaccgtggc 3480 
caccctgtac tgcgtgcacc agcgcatcga cgtcaaggac accaaggagg ccctggagaa 3540 
gatcgaggag gagcagaaca agtccaagaa gaaggcccag caggccgccg ccgccgccgg 3600
caccggcaac agcagccagg tgagccagaa ctaccccatc gtgcagaacc tgcagggcca 3660
gatggtgcac caggccatca gcccccgcac cctgaacgcc tgggtgaagg tggtggagga 3720
gaaggccttc agccccgagg tgatccccat gttcagcgcc ctgagcgagg gcgccacccc 3780 
ccaggacctg aacacgatgt tgaacaccgt gggcggccac caggccgcca tgcagatgct 3840 
gaaggagacc atcaacgagg aggccgccga gtgggaccgc gtgcaccccg tgcacgccgg 3900
ccccatcgcc cccggccaga tgcgcgagcc ccgcggcagc gacatcgccg gcaccaccag 3960 
caccctgcag gagcagatcg gctggatgac caacaacccc cccatccccg tgggcgagat 4020
ctacaagcgg tggatcatcc tgggcctgaa caagatcgtg cggatgtaca gccccaccag 4080
catcctggac atccgccagg gccccaagga gcccttccgc gactacgtgg accgcttcta 4140
caagaccctg cgcgctgagc aggccagcca ggacgtgaag aactggatga ccgagaccct 4200 
gctggtgcag aacgccaacc ccgactgcaa gaccatcctg aaggctctcg gccccgcggc 4260
caccctggag gagatgatga ccgcctgcca gggcgtgggc ggccccggcc acaaggcccg 4320
cgtgctggcc gaggcgatga gccaggtgac gaacccggcg accatcatga tgcagcgcgg 4380
caacttccgc aaccagcgga agaccgtcaa gtgcttcaac tgcggcaagg agggccacac 4440
cgccaggaac tgccgcgccc cccgcaagaa gggctgctgg cgctgcggcc gcgagggcca 4500 
ccagatgaag gactgcaccg agcgccaggc caacttcctg ggcaagatct ggcccagcta 4560
caagggccgc cccggcaact tcctgcagag ccgccccgag cccaccgccc cccccgagga 4620
gagcttccgc ttcggcgagg agaagaccac ccccagccag aagcaggagc ccatcgacaa 4680 
ggagctgtac cccctgacca gcctgcgcag cctgttcggc aacgacccca gcagccagta 4740
agaattcaga ctcgagcaag tctaga 4766 

<210> 74 
<211> 4689 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp160.modSF162.gag.modSF2

<400> 74 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260 
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440
atcgagcccc tgggcgtggc ccccaccaag gccaagcgcc gcgtggtgca gcgcgagaag 1500 
cgcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560 
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620 
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680 
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740 
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800 
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860 
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920 
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980 
aactggttcg acatcagcaa gtggctgtgg tacatcaaga tcttcatcat gatcgtgggc 2040
ggcctggtgg gcctgcgcat cgtgttcacc gtgctgagca tcgtgaaccg cgtgcgccag 2100 
ggctacagcc ccctgagctt ccagacccgc ttccccgccc cccgcggccc cgaccgcccc 2160 
gagggcatcg aggaggaggg cggcgagcgc gaccgcgacc gcagcagccc cctggtgcac 2220 
ggcctgctgg ccctgatctg ggacgacctg cgcagcctgt gcctgttcag ctaccaccgc 2280 
ctgcgcgacc tgatcctgat cgccgcccgc atcgtggagc tgctgggccg ccgcggctgg 2340 
gaggccctga agtactgggg caacctgctg cagtactgga tccaggagct gaagaacagc 2400 
gccgtgagcc tgttcgacgc catcgccatc gccgtggccg agggcaccga ccgcatcatc 2460 
gaggtggccc agcgcatcgg ccgcgccttc ctgcacatcc cccgccgcat ccgccagggc 2520
ttcgagcgcg ccctgctgta actcgagcaa gtctagagaa ttccgccccc cccccccccc 2580 
cccctctccc tccccccccc ctaacgttac tggccgaagc cgcttggaat aaggccggtg 2640
tgcgtttgtc tatatgttat tttccaccat attgccgtct tttggcaatg tgagggcccg 2700
gaaacctggc cctgtcttct tgacgagcat tcctaggggt ctttcccctc tcgccaaagg 2760 
aatgcaaggt ctgttgaatg tcgtgaagga agcagttcct ctggaagctt cttgaagaca 2820 
aacaacgtct gtagcgaccc tttgcaggca gcggaacccc ccacctggcg acaggtgcct 2880 
ctgcggccaa aagccacgtg tataagatac acctgcaaag gcggcacaac cccagtgcca 2940
cgttgtgagt tggatagttg tggaaagagt caaatggctc tcctcaagcg tattcaacaa 3000
ggggctgaag gatgcccaga aggtacccca ttgtatggga tctgatctgg ggcctcggtg 3060
cacatgcttt acatgtgttt agtcgaggtt aaaaaaacgt ctaggccccc cgaaccacgg 3120
ggacgtggtt ttcctttgaa aaacacgata ataccatggg cgcccgcgcc agcgtgctga 3180 
gcggcggcga gctggacaag tgggagaaga tccgcctgcg ccccggcggc aagaagaagt 3240
acaagctgaa gcacatcgtg tgggccagcc gcgagctgga gcgcttcgcc gtgaaccccg 3300
gcctgctgga gaccagcgag ggctgccgcc agatcctggg ccagctgcag cccagcctgc 3360 
agaccggcag cgaggagctg cgcagcctgt acaacaccgt ggccaccctg tactgcgtgc 3420
accagcgcat cgacgtcaag gacaccaagg aggccctgga gaagatcgag gaggagcaga 3480 
acaagtccaa gaagaaggcc cagcaggccg ccgccgccgc cggcaccggc aacagcagcc 3540
aggtgagcca gaactacccc atcgtgcaga acctgcaggg ccagatggtg caccaggcca 3600 
tcagcccccg caccctgaac gcctgggtga aggtggtgga ggagaaggcc ttcagccccg 3660 
aggtgatccc catgttcagc gccctgagcg agggcgccac cccccaggac ctgaacacga 3720 
tgttgaacac cgtgggcggc caccaggccg ccatgcagat gctgaaggag accatcaacg 3780 
aggaggccgc cgagtgggac cgcgtgcacc ccgtgcacgc cggccccatc gcccccggcc 3840
agatgcgcga gccccgcggc agcgacatcg ccggcaccac cagcaccctg caggagcaga 3900 
tcggctggat gaccaacaac ccccccatcc ccgtgggcga gatctacaag cggtggatca 3960
tcctgggcct gaacaagatc gtgcggatgt acagccccac cagcatcctg gacatccgcc 4020 
agggccccaa ggagcccttc cgcgactacg tggaccgctt ctacaagacc ctgcgcgctg 4080
agcaggccag ccaggacgtg aagaactgga tgaccgagac cctgctggtg cagaacgcca 4140
accccgactg caagaccatc ctgaaggctc tcggccccgc ggccaccctg gaggagatga 4200 
tgaccgcctg ccagggcgtg ggcggccccg gccacaaggc ccgcgtgctg gccgaggcga 4260 
tgagccaggt gacgaacccg gcgaccatca tgatgcagcg cggcaacttc cgcaaccagc 4320
ggaagaccgt caagtgcttc aactgcggca aggagggcca caccgccagg aactgccgcg 4380
ccccccgcaa gaagggctgc tggcgctgcg gccgcgaggg ccaccagatg aaggactgca 4440
ccgagcgcca ggccaacttc ctgggcaaga tctggcccag ctacaagggc cgccccggca 4500 
acttcctgca gagccgcccc gagcccaccg ccccccccga ggagagcttc cgcttcggcg 4560 
aggagaagac cacccccagc cagaagcagg agcccatcga caaggagctg taccccctga 4620 
ccagcctgcg cagcctgttc ggcaacgacc ccagcagcca gtaagaattc agactcgagc 4680
aagtctaga 4689 

<210> 75 
<211> 4472 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
gp160.modUS4.delV1/V2.gag.modSF2

<400> 75 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360
ggccaggcct gccccaaggt gagcttcgag cccatcccca tccactactg cgcccccgcc 420 
ggcttcgcca tcctgaagtg caaggacaag aagttcaacg gcaccggccc ctgcaagaac 480
gtgagcaccg tgcagtgcac ccacggcatc cgccccgtgg tgagcaccca gctgctgctg 540 
aacggcagcc tggccgagga ggagatcgtg ctgcgctccg agaacttcac cgacaacgcc 600 
aagaccatca tcgtgcagct gaacgagtcc gtggagatca actgcatccg ccccaacaac 660 
aacacgcgta agagcatcca catcggcccc ggccgcgcct tctacgccac cggcgacatc 720
atcggcgaca tccgccaggc ccactgcaac atcagcaagg ccaactggac caacaccctc 780 
gagcagatcg tggagaagct gcgcgagcag ttcggcaaca acaagaccat catcttcaac 840 
agcagcagcg gcggcgaccc cgagatcgtg ttccacagct tcaactgcgg cggcgagttc 900 
ttctactgca acaccagcca gctgttcaac agcacctgga acatcaccga ggaggtgaac 960 
aagaccaagg agaacgacac catcatcctg ccctgccgca tccgccagat catcaacatg 1020
tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat caagtgcagc 1080 
agcaatatta ccggcctgct gctgacccgc gacggcggca ccaacaacaa ccgcaccaac 1140 
gacaccgaga ccttccgccc cggcggcggc aacatgaagg acaactggcg cagcgagctg 1200
tacaagtaca aggtggtgcg catcgagccc ctgggcgtgg cccccaccca ggccaagcgc 1260 
cgcgtggtgc agcgcgagaa gcgcgccgtg ggcctgggcg ccctgttcat cggcttcctg 1320
ggcgccgccg ggagcaccat gggcgccgcc tccgtgaccc tgaccgtgca ggcccgccag 1380
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440 
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcat cctggccgtg 1500
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560 
tgcaccacca ccgtgccctg gaacagcagc tggagcaaca agagcctgac cgagatctgg 1620 
gacaacatga cctggatgga gtgggagcgc gagatcggca actacaccgg cctgatctac 1680 
aacctgatcg agatcgccca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740
gacaagtggg ccagcctgtg gaactggttc gacatcacca actggctgtg gtacatccgc 1800 
atcttcatca tgatcgtggg cggcctgatc ggcctgcgca tcgtgttcgc cgtgctgagc 1860 
atcgtgaacc gcgtgcgcca gggctacagc cccatcagcc tgcagacccg cctgcccgcc 1920
cagcgcggcc ccgaccgccc cgagggcatc gaggaggagg gcggcgagcg cgaccgcgac 1980
cgcagcaacc gcctggtgca cggcctgctg gccctgatct gggacgacct gcgcagcctg 2040
tgcctgttca gctaccaccg cctgcgcgac ctgctgctga tcgtggcccg catcgtggag 2100 
ctgctgggcc gccgcggctg ggaggccctg aagtactggt ggaacctgct gcagtactgg 2160 
agccaggagc tgaagagcag cgccgtgagc ctgttcaacg ccaccgccat cgccgtggcc 2220
gagggcaccg accgcatcat cgagatcgtg cagcgcatct tccgcgccgt gatccacatc 2280
ccccgccgca tccgccaggg cctggagcgc gccctgctgt aagatatcgg atcctctaga 2340 
gaattccgcc cccccccccc ccccccctct ccctcccccc cccctaacgt tactggccga 2400 
agccgcttgg aataaggccg gtgtgcgttt gtctatatgt tattttccac catattgccg 2460 
tcttttggca atgtgagggc ccggaaacct ggccctgtct tcttgacgag cattcctagg 2520
ggtctttccc ctctcgccaa aggaatgcaa ggtctgttga atgtcgtgaa ggaagcagtt 2580
cctctggaag cttcttgaag acaaacaacg tctgtagcga ccctttgcag gcagcggaac 2640
cccccacctg gcgacaggtg cctctgcggc caaaagccac gtgtataaga tacacctgca 2700 
aaggcggcac aaccccagtg ccacgttgtg agttggatag ttgtggaaag agtcaaatgg 2760
ctctcctcaa gcgtattcaa caaggggctg aaggatgccc agaaggtacc ccattgtatg 2820
ggatctgatc tggggcctcg gtgcacatgc tttacatgtg tttagtcgag gttaaaaaaa 2880 
cgtctaggcc ccccgaacca cggggacgtg gttttccttt gaaaaacacg ataataccat 2940 
gggcgcccgc gccagcgtgc tgagcggcgg cgagctggac aagtgggaga agatccgcct 3000
gcgccccggc ggcaagaaga agtacaagct gaagcacatc gtgtgggcca gccgcgagct 3060 
ggagcgcttc gccgtgaacc ccggcctgct ggagaccagc gagggctgcc gccagatcct 3120
gggccagctg cagcccagcc tgcagaccgg cagcgaggag ctgcgcagcc tgtacaacac 3180 
cgtggccacc ctgtactgcg tgcaccagcg catcgacgtc aaggacacca aggaggccct 3240
ggagaagatc gaggaggagc agaacaagtc caagaagaag gcccagcagg ccgccgccgc 3300 
cgccggcacc ggcaacagca gccaggtgag ccagaactac cccatcgtgc agaacctgca 3360 
gggccagatg gtgcaccagg ccatcagccc ccgcaccctg aacgcctggg tgaaggtggt 3420 
ggaggagaag gccttcagcc ccgaggtgat ccccatgttc agcgccctga gcgagggcgc 3480
caccccccag gacctgaaca cgatgttgaa caccgtgggc ggccaccagg ccgccatgca 3540 
gatgctgaag gagaccatca acgaggaggc cgccgagtgg gaccgcgtgc accccgtgca 3600 
cgccggcccc atcgcccccg gccagatgcg cgagccccgc ggcagcgaca tcgccggcac 3660 
caccagcacc ctgcaggagc agatcggctg gatgaccaac aaccccccca tccccgtggg 3720 
cgagatctac aagcggtgga tcatcctggg cctgaacaag atcgtgcgga tgtacagccc 3780 
caccagcatc ctggacatcc gccagggccc caaggagccc ttccgcgact acgtggaccg 3840 
cttctacaag accctgcgcg ctgagcaggc cagccaggac gtgaagaact ggatgaccga 3900
gaccctgctg gtgcagaacg ccaaccccga ctgcaagacc atcctgaagg ctctcggccc 3960
cgcggccacc ctggaggaga tgatgaccgc ctgccagggc gtgggcggcc ccggccacaa 4020 
ggcccgcgtg ctggccgagg cgatgagcca ggtgacgaac ccggcgacca tcatgatgca 4080 
gcgcggcaac ttccgcaacc agcggaagac cgtcaagtgc ttcaactgcg gcaaggaggg 4140 
ccacaccgcc aggaactgcc gcgccccccg caagaagggc tgctggcgct gcggccgcga 4200 
gggccaccag atgaaggact gcaccgagcg ccaggccaac ttcctgggca agatctggcc 4260 
cagctacaag ggccgccccg gcaacttcct gcagagccgc cccgagccca ccgccccccc 4320
cgaggagagc ttccgcttcg gcgaggagaa gaccaccccc agccagaagc aggagcccat 4380
cgacaaggag ctgtaccccc tgaccagcct gcgcagcctg ttcggcaacg accccagcag 4440
ccagtaagaa ttcagactcg agcaagtcta ga 4472 



<210> 76 
<211> 4608 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp160.modSF162.delV2.gag.modSF2



<400> 76 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840 
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200 
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260 
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320 
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380
cccaccaagg ccaagcgccg cgtggtgcag cgcgagaagc gcgccgtgac cctgggcgcc 1440
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500 
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560 
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620 
gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680 
tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740 
agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800 
tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920
tggctgtggt acatcaagat cttcatcatg atcgtgggcg gcctggtggg cctgcgcatc 1980 
gtgttcaccg tgctgagcat cgtgaaccgc gtgcgccagg gctacagccc cctgagcttc 2040
cagacccgct tccccgcccc ccgcggcccc gaccgccccg agggcatcga ggaggagggc 2100 
ggcgagcgcg accgcgaccg cagcagcccc ctggtgcacg gcctgctggc cctgatctgg 2160 
gacgacctgc gcagcctgtg cctgttcagc taccaccgcc tgcgcgacct gatcctgatc 2220 
gccgcccgca tcgtggagct gctgggccgc cgcggctggg aggccctgaa gtactggggc 2280 
aacctgctgc agtactggat ccaggagctg aagaacagcg ccgtgagcct gttcgacgcc 2340 
atcgccatcg ccgtggccga gggcaccgac cgcatcatcg aggtggccca gcgcatcggc 2400 
cgcgccttcc tgcacatccc ccgccgcatc cgccagggct tcgagcgcgc cctgctgtaa 2460 
ctcgagcaag tctagagaat tccgcccccc cccccccccc ccctctccct cccccccccc 2520 
taacgttact ggccgaagcc gcttggaata aggccggtgt gcgtttgtct atatgttatt 2580 
ttccaccata ttgccgtctt ttggcaatgt gagggcccgg aaacctggcc ctgtcttctt 2640
gacgagcatt cctaggggtc tttcccctct cgccaaagga atgcaaggtc tgttgaatgt 2700 
cgtgaaggaa gcagttcctc tggaagcttc ttgaagacaa acaacgtctg tagcgaccct 2760 
ttgcaggcag cggaaccccc cacctggcga caggtgcctc tgcggccaaa agccacgtgt 2820
ataagataca cctgcaaagg cggcacaacc ccagtgccac gttgtgagtt ggatagttgt 2880
ggaaagagtc aaatggctct cctcaagcgt attcaacaag gggctgaagg atgcccagaa 2940 
ggtaccccat tgtatgggat ctgatctggg gcctcggtgc acatgcttta catgtgttta 3000 
gtcgaggtta aaaaaacgtc taggcccccc gaaccacggg gacgtggttt tcctttgaaa 3060
aacacgataa taccatgggc gcccgcgcca gcgtgctgag cggcggcgag ctggacaagt 3120
gggagaagat ccgcctgcgc cccggcggca agaagaagta caagctgaag cacatcgtgt 3180 
gggccagccg cgagctggag cgcttcgccg tgaaccccgg cctgctggag accagcgagg 3240 
gctgccgcca gatcctgggc cagctgcagc ccagcctgca gaccggcagc gaggagctgc 3300
gcagcctgta caacaccgtg gccaccctgt actgcgtgca ccagcgcatc gacgtcaagg 3360 
acaccaagga ggccctggag aagatcgagg aggagcagaa caagtccaag aagaaggccc 3420
agcaggccgc cgccgccgcc ggcaccggca acagcagcca ggtgagccag aactacccca 3480 
tcgtgcagaa cctgcagggc cagatggtgc accaggccat cagcccccgc accctgaacg 3540
cctgggtgaa ggtggtggag gagaaggcct tcagccccga ggtgatcccc atgttcagcg 3600 
ccctgagcga gggcgccacc ccccaggacc tgaacacgat gttgaacacc gtgggcggcc 3660 
accaggccgc catgcagatg ctgaaggaga ccatcaacga ggaggccgcc gagtgggacc 3720
gcgtgcaccc cgtgcacgcc ggccccatcg cccccggcca gatgcgcgag ccccgcggca 3780
gcgacatcgc cggcaccacc agcaccctgc aggagcagat cggctggatg accaacaacc 3840
cccccatccc cgtgggcgag atctacaagc ggtggatcat cctgggcctg aacaagatcg 3900 
tgcggatgta cagccccacc agcatcctgg acatccgcca gggccccaag gagcccttcc 3960 
gcgactacgt ggaccgcttc tacaagaccc tgcgcgctga gcaggccagc caggacgtga 4020 
agaactggat gaccgagacc ctgctggtgc agaacgccaa ccccgactgc aagaccatcc 4080 
tgaaggctct cggccccgcg gccaccctgg aggagatgat gaccgcctgc cagggcgtgg 4140 
gcggccccgg ccacaaggcc cgcgtgctgg ccgaggcgat gagccaggtg acgaacccgg 4200 
cgaccatcat gatgcagcgc ggcaacttcc gcaaccagcg gaagaccgtc aagtgcttca 4260 
actgcggcaa ggagggccac accgccagga actgccgcgc cccccgcaag aagggctgct 4320
ggcgctgcgg ccgcgagggc caccagatga aggactgcac cgagcgccag gccaacttcc 4380
tgggcaagat ctggcccagc tacaagggcc gccccggcaa cttcctgcag agccgccccg 4440 
agcccaccgc cccccccgag gagagcttcc gcttcggcga ggagaagacc acccccagcc 4500 
agaagcagga gcccatcgac aaggagctgt accccctgac cagcctgcgc agcctgttcg 4560 
gcaacgaccc cagcagccag taagaattca gactcgagca agtctaga 4608 

<210> 77 
<211> 1680 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 77 

cccattagtc ctattgaaac tgtaccagta aaattaaagc caggaatgga tggcccaaaa 60 
gttaagcaat ggccattgac agaagaaaaa ataaaagcat tagtagagat atgtacagaa 120
atggaaaagg aagggaaaat ttcaaaaatt gggcctgaaa atccatacaa tactccagta 180
tttgctataa agaaaaaaga cagtactaaa tggagaaaac tagtagattt cagagaactt 240 
aataaaagaa ctcaagactt ctgggaagtt cagttaggaa taccacaccc cgcagggtta 300
aaaaagaaaa aatcagtaac agtattggat gtgggtgatg catacttttc agttccctta 360
gataaagact ttagaaagta tactgcattt accataccta gtataaacaa tgagacacca 420
gggattagat atcagtacaa tgtgctgcca cagggatgga aaggatcacc agcaatattc 480
caaagtagca tgacaaaaat cttagagcct tttagaaaac agaatccaga catagttatc 540 
tatcaataca tggatgattt gtatgtagga tctgacttag aaatagggca gcatagaaca 600
aaaatagagg aactgagaca gcatctgttg aggtggggat ttaccacacc agacaaaaaa 660 
catcagaaag aacctccatt cctttggatg ggttatgaac tccatcctga taaatggaca 720 
gtacagccta taatgctgcc agaaaaagac agctggactg tcaatgacat acagaagtta 780 
gtgggaaaat tgaattgggc aagtcagatt tatgcaggga ttaaagtaaa gcagttatgt 840
aaactcctta gaggaaccaa agcactaaca gaagtaatac cactaacaga agaagcagag 900 
ctagaactgg cagaaaacag ggagattcta aaagaaccag tacatgaagt atattatgac 960 
ccatcaaaag acttagtagc agaaatacag aagcaggggc aaggccaatg gacatatcaa 1020
atttatcaag agccatttaa aaatctgaaa acaggaaagt atgcaaggat gaggggtgcc 1080 
cacactaatg atgtaaaaca gttaacagag gcagtgcaaa aagtatccac agaaagcata 1140
gtaatatggg gaaagattcc taaatttaaa ctacccatac aaaaggaaac atgggaagca 1200
tggtggatgg agtattggca agctacctgg attcctgagt gggagtttgt caatacccct 1260 
cccttagtga aattatggta ccagttagag aaagaaccca tagtaggagc agaaactttc 1320
tatgtagatg gggcagctaa tagggagact aaattaggaa aagcaggata tgttactgac 1380
agaggaagac aaaaagttgt ctccatagct gacacaacaa atcagaagac tgaattacaa 1440 
gcaattcatc tagctttgca ggattcggga ttagaagtaa acatagtaac agactcacaa 1500 
tatgcattag gaatcattca agcacaacca gataagagtg aatcagagtt agtcagtcaa 1560
ataatagagc agttaataaa aaaggaaaag gtctacctgg catgggtacc agcacacaaa 1620
ggaattggag gaaatgaaca agtagataaa ttagtcagtg ctggaatcag gaaagtacta 1680 

<210> 78 
<211> 1865 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: GP1 



<400> 78 

gtcgacgcca ccatgggcgc ccgcgccagc gtgctgagcg gcggcgagct ggacaagtgg 60 
gagaagatcc gcctgcgccc cggcggcaag aagaagtaca agctgaagca catcgtgtgg 120
gccagccgcg agctggagcg cttcgccgtg aaccccggcc tgctggagac cagcgagggc 180 
tgccgccaga tcctgggcca gctgcagccc agcctgcaga ccggcagcga ggagctgcgc 240 
agcctgtaca acaccgtggc caccctgtac tgcgtgcacc agcgcatcga cgtcaaggac 300
accaaggagg ccctggagaa gatcgaggag gagcagaaca agtccaagaa gaaggcccag 360 
caggccgccg ccgccgccgg caccggcaac agcagccagg tgagccagaa ctaccccatc 420
gtgcagaacc tgcagggcca gatggtgcac caggccatca gcccccgcac cctgaacgcc 480
tgggtgaagg tggtggagga gaaggccttc agccccgagg tgatccccat gttcagcgcc 540 
ctgagcgagg gcgccacccc ccaggacctg aacacgatgt tgaacaccgt gggcggccac 600 
caggccgcca tgcagatgct gaaggagacc atcaacgagg aggccgccga gtgggaccgc 660 
gtgcaccccg tgcacgccgg ccccatcgcc cccggccaga tgcgcgagcc ccgcggcagc 720 
gacatcgccg gcaccaccag caccctgcag gagcagatcg gctggatgac caacaacccc 780 
cccatccccg tgggcgagat ctacaagcgg tggatcatcc tgggcctgaa caagatcgtg 840 
cggatgtaca gccccaccag catcctggac atccgccagg gccccaagga gcccttccgc 900 
gactacgtgg accgcttcta caagaccctg cgcgctgagc aggccagcca ggacgtgaag 960 
aactggatga ccgagaccct gctggtgcag aacgccaacc ccgactgcaa gaccatcctg 1020 
aaggctctcg gccccgcggc caccctggag gagatgatga ccgcctgcca gggcgtgggc 1080 
ggccccggcc acaaggcccg cgtgctggcc gaggcgatga gccaggtgac gaacccggcg 1140 
accatcatga tgcagcgcgg caacttccgc aaccagcgga agaccgtcaa gtgcttcaac 1200
tgcggcaagg agggccacac cgccaggaac tgccgcgccc cccgcaagaa gggctgctgg 1260 
cgctgcggcc gcgaaggaca ccaaatgaaa gattgcactg agagacaggc taatttttta 1320
gggaagatct ggccttccta caagggaagg ccagggaatt ttcttcagag cagaccagag 1380
ccaacagccc caccagaaga gagcttcagg tttggggagg agaaaacaac tccctctcag 1440
aagcaggagc cgatagacaa ggaactgtat cctttaactt ccctcagatc actctttggc 1500
aacgacccct cgtcacagta aggatcggcg gccagctcaa ggaggcgctg ctcgacaccg 1560 
gcgccgacga caccgtgctg gaggagatga acctgcccgg caagtggaag cccaagatga 1620 
tcggcgggat cgggggcttc atcaaggtgc ggcagtacga ccagatcccc gtggagatct 1680 
gcggccacaa ggccatcggc accgtgctgg tgggccccac ccccgtgaac atcatcggcc 1740 
gcaacctgct gacccagatc ggctgcaccc tgaacttccc catcagcccc atcgagacgg 1800 
tgcccgtgaa gctgaagccg gggatggacg gccccaaggt caagcagtgg cccctgtaag 1860 
aattc 1865 



<210> 79 
<211> 1865 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: GP2 



<400> 79 

gtcgacgcca ccatgggcgc ccgcgccagc gtgctgagcg gcggcgagct ggacaagtgg 60
gagaagatcc gcctgcgccc cggcggcaag aagaagtaca agctgaagca catcgtgtgg 120
gccagccgcg agctggagcg cttcgccgtg aaccccggcc tgctggagac cagcgagggc 180
tgccgccaga tcctgggcca gctgcagccc agcctgcaga ccggcagcga ggagctgcgc 240
agcctgtaca acaccgtggc caccctgtac tgcgtgcacc agcgcatcga cgtcaaggac 300
accaaggagg ccctggagaa gatcgaggag gagcagaaca agtccaagaa gaaggcccag 360
caggccgccg ccgccgccgg caccggcaac agcagccagg tgagccagaa ctaccccatc 420
gtgcagaacc tgcagggcca gatggtgcac caggccatca gcccccgcac cctgaacgcc 480
tgggtgaagg tggtggagga gaaggccttc agccccgagg tgatccccat gttcagcgcc 540
ctgagcgagg gcgccacccc ccaggacctg aacacgatgt tgaacaccgt gggcggccac 600
caggccgcca tgcagatgct gaaggagacc atcaacgagg aggccgccga gtgggaccgc 660
gtgcaccccg tgcacgccgg ccccatcgcc cccggccaga tgcgcgagcc ccgcggcagc 720
gacatcgccg gcaccaccag caccctgcag gagcagatcg gctggatgac caacaacccc 780
cccatccccg tgggcgagat ctacaagcgg tggatcatcc tgggcctgaa caagatcgtg 840
cggatgtaca gccccaccag catcctggac atccgccagg gccccaagga gcccttccgc 900
gactacgtgg accgcttcta caagaccctg cgcgctgagc aggccagcca ggacgtgaag 960
aactggatga ccgagaccct gctggtgcag aacgccaacc ccgactgcaa gaccatcctg 1020
aaggctctcg gccccgcggc caccctggag gagatgatga ccgcctgcca gggcgtgggc 1080
ggccccggcc acaaggcccg cgtgctggcc gaggcgatga gccaggtgac gaacccggcg 1140
accatcatga tgcagcgcgg caacttccgc aaccagcgga agaccgtcaa gtgcttcaac 1200
tgcggcaagg agggccacac cgccaggaac tgccgcgccc cccgcaagaa gggctgctgg 1260
cgctgcggcc gcgaaggaca ccaaatgaaa gattgcactg agagacaggc taatttttta 1320
gggaagatct ggccttccta caagggaagg ccagggaatt ttcttcagag cagaccagag 1380
ccaacagccc caccagaaga gagcttcagg tttggggagg agaaaacaac tccctctcag 1440
aagcaggagc cgatagacaa ggaactgtat cctttaactt ccctcagatc actctttggc 1500
aacgacccct cgtcacagta aggatcgggg ggcaactcaa ggaagcgctg ctcgatacag 1560
gagcagatga tacagtatta gaagaaatga atttgccagg aaaatggaaa ccaaaaatga 1620
taggggggat cgggggcttc atcaaggtga ggcagtacga ccagatacct gtagaaatct 1680
gtggacataa agctataggt acagtattag taggacctac acctgtcaac ataattggaa 1740
gaaatctgtt gacccagatc ggctgcacct tgaacttccc catcagccct attgagacgg 1800
tgcccgtgaa gttgaagccg gggatggacg gccccaaggt caagcaatgg ccattgtaag 1860
aattc 1865



<210> 80 
<211> 2305 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
FS(+).proinact.RTopt.YM



<400> 80 

gcggccgcga aggacaccaa atgaaagatt gcactgagag acaggctaat tttttaggga 60
agatctggcc ttcctacaag ggaaggccag ggaattttct tcagagcaga ccagagccaa 120
cagccccacc agaagagagc ttcaggtttg gggaggagaa aacaactccc tctcagaagc 180
aggagccgat agacaaggaa ctgtatcctt taacttccct cagatcactc tttggcaacg 240
acccctcgtc acaataagga tcggggggca actcaaggaa gcgctgctcg atacaggagc 300
agatgataca gtattagaag aaatgaattt gccaggaaaa tggaaaccaa aaatgatagg 360
ggggatcggg ggcttcatca aggtgaggca gtacgaccag atacctgtag aaatctgtgg 420
acataaagct ataggtacag tattagtagg acctacacct gtcaacataa ttggaagaaa 480
tctgttgacc cagatcggct gcaccttgaa cttccccatc agccctattg agacggtgcc 540
cgtgaagttg aagccgggga tggacggccc caaggtcaag caatggccat tgaccgagga 600
gaagatcaag gccctggtgg agatctgcac cgagatggag aaggagggca agatcagcaa 660
gatcggcccc gagaacccct acaacacccc cgtgttcgcc atcaagaaga aggacagcac 720
caagtggcgc aagctggtgg acttccgcga gctgaacaag cgcacccagg acttctggga 780
ggtgcagctg ggcatccccc accccgccgg cctgaagaag aagaagagcg tgaccgtgct 840
ggacgtgggc gacgcctact tcagcgtgcc cctggacaag gacttccgca agtacaccgc 900
cttcaccatc cccagcatca acaacgagac ccccggcatc cgctaccagt acaacgtgct 960
gccccagggc tggaagggca gccccgccat cttccagagc agcatgacca agatcctgga 1020
gcccttccgc aagcagaacc ccgacatcgt gatctaccag gcccccctgt acgtgggcag 1080
cgacctggag atcggccagc accgcaccaa gatcgaggag ctgcgccagc acctgctgcg 1140
ctggggcttc accacccccg acaagaagca ccagaaggag ccccccttcc tgtggatggg 1200
ctacgagctg caccccgaca agtggaccgt gcagcccatc atgctgcccg agaaggacag 1260
ctggaccgtg aacgacatcc agaagctggt gggcaagctg aactgggcca gccagatcta 1320
cgccggcatc aaggtgaagc agctgtgcaa gctgctgcgc ggcaccaagg ccctgaccga 1380
ggtgatcccc ctgaccgagg aggccgagct ggagctggcc gagaaccgcg agatcctgaa 1440
ggagcccgtg cacgaggtgt actacgaccc cagcaaggac ctggtggccg agatccagaa 1500
gcagggccag ggccagtgga cctaccagat ctaccaggag cccttcaaga acctgaagac 1560
cggcaagtac gcccgcatgc gcggcgccca caccaacgac gtgaagcagc tgaccgaggc 1620
cgtgcagaag gtgagcaccg agagcatcgt gatctggggc aagatcccca agttcaagct 1680
gcccatccag aaggagacct gggaggcctg gtggatggag tactggcagg ccacctggat 1740
ccccgagtgg gagttcgtga acaccccccc cctggtgaag ctgtggtacc agctggagaa 1800
ggagcccatc gtgggcgccg agaccttcta cgtggacggc gccgccaacc gcgagaccaa 1860
gctgggcaag gccggctacg tgaccgaccg gggccggcag aaggtggtga gcatcgccga 1920
caccaccaac cagaagaccg agctgcaggc catccacctg gccctgcagg acagcggcct 1980
ggaggtgaac atcgtgaccg acagccagta cgccctgggc atcatccagg cccagcccga 2040
caagagcgag agcgagctgg tgagccagat catcgagcag ctgatcaaga aggagaaggt 2100
gtacctggcc tgggtgcccg cccacaaggg catcggcggc aacgagcagg tggacaagct 2160
ggtgagcgcc ggcatccgca aggtgctgtt cctgaacggc atcgatggcg gcatcgtgat 2220
ctaccagtac atggacgacc tgtacgtggg cagcggcggc cctaggatcg attaaaagct 2280
tcccggggct agcaccggtg aattc 2305

<210> 81
<211> 2299
<212> DNA

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: 
FS(+).proinact.RTopt.YMWM



<400> 81 

gcggccgcga aggacaccaa atgaaagatt gcactgagag acaggctaat tttttaggga 60
agatctggcc ttcctacaag ggaaggccag ggaattttct tcagagcaga ccagagccaa 120
cagccccacc agaagagagc ttcaggtttg gggaggagaa aacaactccc tctcagaagc 180
aggagccgat agacaaggaa ctgtatcctt taacttccct cagatcactc tttggcaacg 240
acccctcgtc acaataagga tcggggggca actcaaggaa gcgctgctcg atacaggagc 300
agatgataca gtattagaag aaatgaattt gccaggaaaa tggaaaccaa aaatgatagg 360
ggggatcggg ggcttcatca aggtgaggca gtacgaccag atacctgtag aaatctgtgg 420
acataaagct ataggtacag tattagtagg acctacacct gtcaacataa ttggaagaaa 480
tctgttgacc cagatcggct gcaccttgaa cttccccatc agccctattg agacggtgcc 540
cgtgaagttg aagccgggga tggacggccc caaggtcaag caatggccat tgaccgagga 600
gaagatcaag gccctggtgg agatctgcac cgagatggag aaggagggca agatcagcaa 660
gatcggcccc gagaacccct acaacacccc cgtgttcgcc atcaagaaga aggacagcac 720
caagtggcgc aagctggtgg acttccgcga gctgaacaag cgcacccagg acttctggga 780
ggtgcagctg ggcatccccc accccgccgg cctgaagaag aagaagagcg tgaccgtgct 840
ggacgtgggc gacgcctact tcagcgtgcc cctggacaag gacttccgca agtacaccgc 900
cttcaccatc cccagcatca acaacgagac ccccggcatc cgctaccagt acaacgtgct 960
gccccagggc tggaagggca gccccgccat cttccagagc agcatgacca agatcctgga 1020
gcccttccgc aagcagaacc ccgacatcgt gatctaccag gcccccctgt acgtgggcag 1080
cgacctggag atcggccagc accgcaccaa gatcgaggag ctgcgccagc acctgctgcg 1140
ctggggcttc accacccccg acaagaagca ccagaaggag ccccccttcc tgcccatcga 1200
gctgcacccc gacaagtgga ccgtgcagcc catcatgctg cccgagaagg acagctggac 1260
cgtgaacgac atccagaagc tggtgggcaa gctgaactgg gccagccaga tctacgccgg 1320
catcaaggtg aagcagctgt gcaagctgct gcgcggcacc aaggccctga ccgaggtgat 1380
ccccctgacc gaggaggccg agctggagct ggccgagaac cgcgagatcc tgaaggagcc 1440
cgtgcacgag gtgtactacg accccagcaa ggacctggtg gccgagatcc agaagcaggg 1500 
ccagggccag tggacctacc agatctacca ggagcccttc aagaacctga agaccggcaa 1560 
gtacgcccgc atgcgcggcg cccacaccaa cgacgtgaag cagctgaccg aggccgtgca 1620 
gaaggtgagc accgagagca tcgtgatctg gggcaagatc cccaagttca agctgcccat 1680 
ccagaaggag acctgggagg cctggtggat ggagtactgg caggccacct ggatccccga 1740 
gtgggagttc gtgaacaccc cccccctggt gaagctgtgg taccagctgg agaaggagcc 1800
catcgtgggc gccgagacct tctacgtgga cggcgccgcc aaccgcgaga ccaagctggg 1860 
caaggccggc tacgtgaccg accggggccg gcagaaggtg gtgagcatcg ccgacaccac 1920 
caaccagaag accgagctgc aggccatcca cctggccctg caggacagcg gcctggaggt 1980 
gaacatcgtg accgacagcc agtacgccct gggcatcatc caggcccagc ccgacaagag 2040
cgagagcgag ctggtgagcc agatcatcga gcagctgatc aagaaggaga aggtgtacct 2100 
ggcctgggtg cccgcccaca agggcatcgg cggcaacgag caggtggaca agctggtgag 2160 
cgccggcatc cgcaaggtgc tgttcctgaa cggcatcgat ggcggcatcg tgatctacca 2220
gtacatggac gacctgtacg tgggcagcgg cggccctagg atcgattaaa agcttcccgg 2280 
ggctagcacc ggtgaattc 2299 

<210> 82 
<211> 2306 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
FS(-).protmod.RTopt.YM

<400> 82 

gcggccgcga aggacaccaa atgaaagatt gcactgagag acaggctaat ttcttccgcg 60 
aggacctggc cttcctgcag ggcaaggccc gcgagttcag cagcgagcag acccgcgcca 120
acagccccac ccgccgcgag ctgcaggtgt ggggcggcga gaacaacagc ctgagcgagg 180 
ccggcgccga ccgccagggc accgtgagct tcaacttccc ccagatcacc ctgtggcagc 240 
gccccctggt gaccatcagg atcggcggcc agctcaagga ggcgctgctc gacaccggcg 300 
ccgacgacac cgtgctggag gagatgaacc tgcccggcaa gtggaagccc aagatgatcg 360 
gcgggatcgg gggcttcatc aaggtgcggc agtacgacca gatccccgtg gagatctgcg 420
gccacaaggc catcggcacc gtgctggtgg gccccacccc cgtgaacatc atcggccgca 480 
acctgctgac ccagatcggc tgcaccctga acttccccat cagccccatc gagacggtgc 540 
ccgtgaagct gaagccgggg atggacggcc ccaaggtcaa gcagtggccc ctgaccgagg 600 
agaagatcaa ggccctggtg gagatctgca ccgagatgga gaaggagggc aagatcagca 660 
agatcggccc cgagaacccc tacaacaccc ccgtgttcgc catcaagaag aaggacagca 720
ccaagtggcg caagctggtg gacttccgcg agctgaacaa gcgcacccag gacttctggg 780 
aggtgcagct gggcatcccc caccccgccg gcctgaagaa gaagaagagc gtgaccgtgc 840
tggacgtggg cgacgcctac ttcagcgtgc ccctggacaa ggacttccgc aagtacaccg 900 
ccttcaccat ccccagcatc aacaacgaga cccccggcat ccgctaccag tacaacgtgc 960 
tgccccaggg ctggaagggc agccccgcca tcttccagag cagcatgacc aagatcctgg 1020
agcccttccg caagcagaac cccgacatcg tgatctacca ggcccccctg tacgtgggca 1080 
gcgacctgga gatcggccag caccgcacca agatcgagga gctgcgccag cacctgctgc 1140 
gctggggctt caccaccccc gacaagaagc accagaagga gccccccttc ctgtggatgg 1200 
gctacgagct gcaccccgac aagtggaccg tgcagcccat catgctgccc gagaaggaca 1260
gctggaccgt gaacgacatc cagaagctgg tgggcaagct gaactgggcc agccagatct 1320
acgccggcat caaggtgaag cagctgtgca agctgctgcg cggcaccaag gccctgaccg 1380
aggtgatccc cctgaccgag gaggccgagc tggagctggc cgagaaccgc gagatcctga 1440 
aggagcccgt gcacgaggtg tactacgacc ccagcaagga cctggtggcc gagatccaga 1500 
agcagggcca gggccagtgg acctaccaga tctaccagga gcccttcaag aacctgaaga 1560 
ccggcaagta cgcccgcatg cgcggcgccc acaccaacga cgtgaagcag ctgaccgagg 1620
ccgtgcagaa ggtgagcacc gagagcatcg tgatctgggg caagatcccc aagttcaagc 1680
tgcccatcca gaaggagacc tgggaggcct ggtggatgga gtactggcag gccacctgga 1740
tccccgagtg ggagttcgtg aacacccccc ccctggtgaa gctgtggtac cagctggaga 1800
aggagcccat cgtgggcgcc gagaccttct acgtggacgg cgccgccaac cgcgagacca 1860
agctgggcaa ggccggctac gtgaccgacc ggggccggca gaaggtggtg agcatcgccg 1920
acaccaccaa ccagaagacc gagctgcagg ccatccacct ggccctgcag gacagcggcc 1980
tggaggtgaa catcgtgacc gacagccagt acgccctggg catcatccag gcccagcccg 2040
acaagagcga gagcgagctg gtgagccaga tcatcgagca gctgatcaag aaggagaagg 2100
tgtacctggc ctgggtgccc gcccacaagg gcatcggcgg caacgagcag gtggacaagc 2160
tggtgagcgc cggcatccgc aaggtgctgt tcctgaacgg catcgatggc ggcatcgtga 2220
tctaccagta catggacgac ctgtacgtgg gcagcggcgg ccctaggatc gattaaaagc 2280
ttcccggggc tagcaccggt gaattc 2306

<210> 83
<211> 2300
<212> DNA

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: 
FS(-).protmod.RTopt.YMWM



<400> 83 

gcggccgcga aggacaccaa atgaaagatt gcactgagag acaggctaat ttcttccgcg 60
aggacctggc cttcctgcag ggcaaggccc gcgagttcag cagcgagcag acccgcgcca 120
acagccccac ccgccgcgag ctgcaggtgt ggggcggcga gaacaacagc ctgagcgagg 180
ccggcgccga ccgccagggc accgtgagct tcaacttccc ccagatcacc ctgtggcagc 240
gccccctggt gaccatcagg atcggcggcc agctcaagga ggcgctgctc gacaccggcg 300
ccgacgacac cgtgctggag gagatgaacc tgcccggcaa gtggaagccc aagatgatcg 360
gcgggatcgg gggcttcatc aaggtgcggc agtacgacca gatccccgtg gagatctgcg 420
gccacaaggc catcggcacc gtgctggtgg gccccacccc cgtgaacatc atcggccgca 480
acctgctgac ccagatcggc tgcaccctga acttccccat cagccccatc gagacggtgc 540
ccgtgaagct gaagccgggg atggacggcc ccaaggtcaa gcagtggccc ctgaccgagg 600
agaagatcaa ggccctggtg gagatctgca ccgagatgga gaaggagggc aagatcagca 660
agatcggccc cgagaacccc tacaacaccc ccgtgttcgc catcaagaag aaggacagca 720
ccaagtggcg caagctggtg gacttccgcg agctgaacaa gcgcacccag gacttctggg 780
aggtgcagct gggcatcccc caccccgccg gcctgaagaa gaagaagagc gtgaccgtgc 840
tggacgtggg cgacgcctac ttcagcgtgc ccctggacaa ggacttccgc aagtacaccg 900
ccttcaccat ccccagcatc aacaacgaga cccccggcat ccgctaccag tacaacgtgc 960
tgccccaggg ctggaagggc agccccgcca tcttccagag cagcatgacc aagatcctgg 1020
agcccttccg caagcagaac cccgacatcg tgatctacca ggcccccctg tacgtgggca 1080
gcgacctgga gatcggccag caccgcacca agatcgagga gctgcgccag cacctgctgc 1140
gctggggctt caccaccccc gacaagaagc accagaagga gccccccttc ctgcccatcg 1200
agctgcaccc cgacaagtgg accgtgcagc ccatcatgct gcccgagaag gacagctgga 1260
ccgtgaacga catccagaag ctggtgggca agctgaactg ggccagccag atctacgccg 1320
gcatcaaggt gaagcagctg tgcaagctgc tgcgcggcac caaggccctg accgaggtga 1380
tccccctgac cgaggaggcc gagctggagc tggccgagaa ccgcgagatc ctgaaggagc 1440
ccgtgcacga ggtgtactac gaccccagca aggacctggt ggccgagatc cagaagcagg 1500
gccagggcca gtggacctac cagatctacc aggagccctt caagaacctg aagaccggca 1560
agtacgcccg catgcgcggc gcccacacca acgacgtgaa gcagctgacc gaggccgtgc 1620
agaaggtgag caccgagagc atcgtgatct ggggcaagat ccccaagttc aagctgccca 1680
tccagaagga gacctgggag gcctggtgga tggagtactg gcaggccacc tggatccccg 1740
agtgggagtt cgtgaacacc ccccccctgg tgaagctgtg gtaccagctg gagaaggagc 1800
ccatcgtggg cgccgagacc ttctacgtgg acggcgccgc caaccgcgag accaagctgg 1860
gcaaggccgg ctacgtgacc gaccggggcc ggcagaaggt ggtgagcatc gccgacacca 1920
ccaaccagaa gaccgagctg caggccatcc acctggccct gcaggacagc ggcctggagg 1980
tgaacatcgt gaccgacagc cagtacgccc tgggcatcat ccaggcccag cccgacaaga 2040
gcgagagcga gctggtgagc cagatcatcg agcagctgat caagaaggag aaggtgtacc 2100
tggcctgggt gcccgcccac aagggcatcg gcggcaacga gcaggtggac aagctggtga 2160
gcgccggcat ccgcaaggtg ctgttcctga acggcatcga tggcggcatc gtgatctacc 2220
agtacatgga cgacctgtac gtgggcagcg gcggccctag gatcgattaa aagcttcccg 2280
gggctagcac cggtgaattc 2300

<210> 84
<211> 2312
<212> DNA

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: 
FS(-).protmod.RTopt(+)



<400> 84 

gcggccgcga aggacaccaa atgaaagatt gcactgagag acaggctaat ttcttccgcg 60
aggacctggc cttcctgcag ggcaaggccc gcgagttcag cagcgagcag acccgcgcca 120
acagccccac ccgccgcgag ctgcaggtgt ggggcggcga gaacaacagc ctgagcgagg 180
ccggcgccga ccgccagggc accgtgagct tcaacttccc ccagatcacc ctgtggcagc 240
gccccctggt gaccatcagg atcggcggcc agctcaagga ggcgctgctc gacaccggcg 300
ccgacgacac cgtgctggag gagatgaacc tgcccggcaa gtggaagccc aagatgatcg 360
gcgggatcgg gggcttcatc aaggtgcggc agtacgacca gatccccgtg gagatctgcg 420
gccacaaggc catcggcacc gtgctggtgg gccccacccc cgtgaacatc atcggccgca 480
acctgctgac ccagatcggc tgcaccctga acttccccat cagccccatc gagacggtgc 540
ccgtgaagct gaagccgggg atggacggcc ccaaggtcaa gcagtggccc ctgaccgagg 600
agaagatcaa ggccctggtg gagatctgca ccgagatgga gaaggagggc aagatcagca 660
agatcggccc cgagaacccc tacaacaccc ccgtgttcgc catcaagaag aaggacagca 720
ccaagtggcg caagctggtg gacttccgcg agctgaacaa gcgcacccag gacttctggg 780
aggtgcagct gggcatcccc caccccgccg gcctgaagaa gaagaagagc gtgaccgtgc 840
tggacgtggg cgacgcctac ttcagcgtgc ccctggacaa ggacttccgc aagtacaccg 900
ccttcaccat ccccagcatc aacaacgaga cccccggcat ccgctaccag tacaacgtgc 960
tgccccaggg ctggaagggc agccccgcca tcttccagag cagcatgacc aagatcctgg 1020
agcccttccg caagcagaac cccgacatcg tgatctacca gtacatggac gacctgtacg 1080
tgggcagcga cctggagatc ggccagcacc gcaccaagat cgaggagctg cgccagcacc 1140
tgctgcgctg gggcttcacc acccccgaca agaagcacca gaaggagccc cccttcctgt 1200
ggatgggcta cgagctgcac cccgacaagt ggaccgtgca gcccatcatg ctgcccgaga 1260
aggacagctg gaccgtgaac gacatccaga agctggtggg caagctgaac tgggccagcc 1320
agatctacgc cggcatcaag gtgaagcagc tgtgcaagct gctgcgcggc accaaggccc 1380
tgaccgaggt gatccccctg accgaggagg ccgagctgga gctggccgag aaccgcgaga 1440
tcctgaagga gcccgtgcac gaggtgtact acgaccccag caaggacctg gtggccgaga 1500
tccagaagca gggccagggc cagtggacct accagatcta ccaggagccc ttcaagaacc 1560
tgaagaccgg caagtacgcc cgcatgcgcg gcgcccacac caacgacgtg aagcagctga 1620
ccgaggccgt gcagaaggtg agcaccgaga gcatcgtgat ctggggcaag atccccaagt 1680
tcaagctgcc catccagaag gagacctggg aggcctggtg gatggagtac tggcaggcca 1740
cctggatccc cgagtgggag ttcgtgaaca ccccccccct ggtgaagctg tggtaccagc 1800
tggagaagga gcccatcgtg ggcgccgaga ccttctacgt ggacggcgcc gccaaccgcg 1860
agaccaagct gggcaaggcc ggctacgtga ccgaccgggg ccggcagaag gtggtgagca 1920
tcgccgacac caccaaccag aagaccgagc tgcaggccat ccacctggcc ctgcaggaca 1980
gcggcctgga ggtgaacatc gtgaccgaca gccagtacgc cctgggcatc atccaggccc 2040
agcccgacaa gagcgagagc gagctggtga gccagatcat cgagcagctg atcaagaagg 2100
agaaggtgta cctggcctgg gtgcccgccc acaagggcat cggcggcaac gagcaggtgg 2160
acaagctggt gagcgccggc atccgcaagg tgctgttcct gaacggcatc gatggcggca 2220 
tcgtgatcta ccagtacatg gacgacctgt acgtgggcag cggcggccct aggatcgatt 2280 
aaaagcttcc cggggctagc accggtgaat tc 2312 

<210> 85 
<211> 306 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 85 

atggagccag tagatcctag attagagccc tggaagcatc caggaagtca gcctaagact 60 
gcttgtacaa attgctattg taaaaagtgt tgctttcatt gccaagtttg tttcataaca 120 
aaaggcttag gcatctccta tggcaggaag aagcggagac agcgacgaag agctcctcca 180 
gacagtgagg ttcatcaagt ttctctacca aagcaacccg cttcccagcc ccaaggggac 240 
ccgacaggcc cgaaggaatc gaagaagaag gtggagagag agacagagac agatccagtc 300
cattag 306 

<210> 86 
<211> 101 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 86 

Met Glu Pro Val Asp Pro Arg Leu Glu Pro Trp Lys His Pro Gly Ser
1               5                   10                  15

Gln Pro Lys Thr Ala Cys Thr Asn Cys Tyr Cys Lys Lys Cys Cys Phe
            20                  25                  30

His Cys Gln Val Cys Phe Ile Thr Lys Gly Leu Gly Ile Ser Tyr Gly
        35                  40                  45

Arg Lys Lys Arg Arg Gln Arg Arg Arg Ala Pro Pro Asp Ser Glu Val
    50                  55                  60

His Gln Val Ser Leu Pro Lys Gln Pro Ala Ser Gln Pro Gln Gly Asp
65                  70                  75                  80

Pro Thr Gly Pro Lys Glu Ser Lys Lys Lys Val Glu Arg Glu Thr Glu
                85                  90                  95

Thr Asp Pro Val His
            100



<210> 87 
<211> 306 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: tat.SF162.opt

<400> 87

atggagcccg tggacccccg cctggagccc tggaagcacc ccggcagcca gcccaagacc 60 
gcctgcacca actgctactg caagaagtgc tgcttccact gccaggtgtg cttcatcacc 120
aagggcctgg gcatcagcta cggccgcaag aagcgccgcc agcgccgccg cgcccccccc 180 
gacagcgagg tgcaccaggt gagcctgccc aagcagcccg ccagccagcc ccagggcgac 240 
cccaccggcc ccaaggagag caagaagaag gtggagcgcg agaccgagac cgaccccgtg 300
cactag 306

<210> 88 
<211> 306 

<212> DNA
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: 
tat.cys22.SF162.opt

<400> 88 

atggagcccg tggacccccg cctggagccc tggaagcacc ccggcagcca gcccaagacc 60 
gccggcacca actgctactg caagaagtgc tgcttccact gccaggtgtg cttcatcacc 120
aagggcctgg gcatcagcta cggccgcaag aagcgccgcc agcgccgccg cgcccccccc 180
gacagcgagg tgcaccaggt gagcctgccc aagcagcccg ccagccagcc ccagggcgac 240 
cccaccggcc ccaaggagag caagaagaag gtggagcgcg agaccgagac cgaccccgtg 300 
cactag 306 

<210> 89 
<211> 168 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
tatamino.SF162.opt

<400> 89 

atggagcccg tggacccccg cctggagccc tggaagcacc ccggcagcca gcccaagacc 60
gcctgcacca actgctactg caagaagtgc tgcttccact gccaggtgtg cttcatcacc 120
aagggcctgg gcatcagcta cggccgcaag aagcgccgcc agcgccgc 168

<210> 90 
<211> 102 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: tat cys22 
SF162 protein 

<400> 90 

Met Glu Pro Val Asp Pro Arg Leu Glu Pro Trp Lys His Pro Gly Ser
1               5                   10                  15

Gln Pro Lys Thr Ala Gly Thr Asn Cys Tyr Cys Lys Lys Cys Cys Phe
            20                  25                  30

His Cys Gln Val Cys Phe Ile Thr Lys Gly Leu Gly Ile Ser Tyr Gly
        35                  40                  45

Arg Lys Lys Arg Arg Gln Arg Arg Arg Ala Pro Pro Asp Ser Glu Val
    50                  55                  60

His Gln Val Ser Leu Pro Lys Gln Pro Ala Ser Gln Pro Gln Gly Asp
65                  70                  75                  80

Pro Thr Gly Pro Lys Glu Ser Lys Lys Lys Val Glu Arg Glu Thr Glu
                85                  90                  95

Thr Asp Pro Val His Glx
            100